Showing posts with label PHP. Show all posts

Sunday, January 01, 2017

Using AWS Machine Learning from ABAP to predict runtimes

Happy new year everybody!

Today I tried out Amazon's Machine Learning capabilities. After working through the basic AWS Machine Learning tutorial and getting to know how the guys at AWS deal with the subject, I got quite excited.



Everything sounds quite easy:

  1. Prepare example data in a single CSV file with good and distinct features for test and training purposes
  2. Create a data source from that CSV file, which basically means verifying that the column types were detected correctly and specifying a result column. 
  3. Create a Machine Learning model from the data source, running an evaluation on it
  4. Create an Endpoint, so your model becomes consumable via a URL based service

My example use case was to predict the runtime of one of our analysis tools - SNP System Scan - given some system parameters. In general any software will probably benefit from good runtime predictions as this is a good way to improve the user experience. We all know the infamous progress bar metaphor that quickly reaches 80% but then takes ages to get to 100%. As a human being I expect progress to be more... linear ;-)


So this seemed like a perfect starting point for exploring Machine Learning. I got my data prepared and ran through all the above steps. My data source only had numerical and categorical columns, but boolean and text columns are available as well. Text is good for unstructured data such as natural language, but I have not gotten into that yet. Everything so far was quite easy and went well.
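For illustration, a training file for this use case could be generated along the following lines. This is only a sketch: the column names are borrowed from the record structure of the ABAP report further down, "runtime_mins" is an assumed name for the result column, and the single data row is made up.

```php
<?php

//column names follow the ABAP record structure; "runtime_mins" is an assumed
//name for the result column, and the data row is made up for illustration
$header = array('comp_version', 'release', 'os', 'db', 'db_used',
                'is_unicode', 'company_industry1', 'scan_version', 'runtime_mins');
$rows = array(
    array('SAP ECC 6.0', '731', 'HP-UX', 'ORACLE 12', 5000, 'X', 'Retail', '16.01', 90),
);

//write header and rows as CSV into memory (use a file path for real data)
$handle = fopen('php://memory', 'w+');
fputcsv($handle, $header, ',', '"', '\\');
foreach ($rows as $row) {
    fputcsv($handle, $row, ',', '"', '\\');
}
rewind($handle);
$csv = stream_get_contents($handle);
fclose($handle);

echo $csv;
```

A mix of categorical columns (os, db, industry) and numerical ones (db_used) like this is what the data source step then has to detect correctly.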

Now I needed to incorporate the results into the software, which is written in ABAP. Hmmm, no SDK for ABAP. Figures! But I still wanted to enable all my colleagues to take advantage of this new buzzword technology and play around with it. I decided on a quick implementation using the proxy pattern.


So I have created an ABAP based API that calls a PHP based REST Service via HTTP, which then utilizes the PHP SDK for AWS to talk to the AWS Machine Learning Endpoint I previously created.

For the ABAP part I wanted to be both as easy and as generic as possible, so the API should work with any ML model and any record structure. The way that ABAP application developers would interact with this API would look like this:


REPORT  /snp/aws01_ml_predict_scan_rt.

PARAMETERS: p_comp TYPE string LOWER CASE OBLIGATORY DEFAULT 'SAP ECC 6.0'.
PARAMETERS: p_rel TYPE string LOWER CASE OBLIGATORY DEFAULT '731'.
PARAMETERS: p_os TYPE string LOWER CASE OBLIGATORY DEFAULT 'HP-UX'.
PARAMETERS: p_db TYPE string LOWER CASE OBLIGATORY DEFAULT 'ORACLE 12'.
PARAMETERS: p_db_gb TYPE i OBLIGATORY DEFAULT '5000'. "5 TB System
PARAMETERS: p_uc TYPE c AS CHECKBOX DEFAULT 'X'. "Is this a unicode system?
PARAMETERS: p_ind TYPE string LOWER CASE OBLIGATORY DEFAULT 'Retail'. "Industry
PARAMETERS: p_svers TYPE string LOWER CASE OBLIGATORY DEFAULT '16.01'. "Scan Version

START-OF-SELECTION.
  PERFORM main.

FORM main.
*"--- DATA DEFINITION -------------------------------------------------
  "Definition of the record, based on which a runtime prediction is to be made
  TYPES: BEGIN OF l_str_system,
          comp_version TYPE string,
          release TYPE string,
          os TYPE string,
          db TYPE string,
          db_used TYPE string,
          is_unicode TYPE c,
          company_industry1 TYPE string,
          scan_version TYPE string,
         END OF l_str_system.

  "AWS Machine Learning API Class
  DATA: lr_ml TYPE REF TO /snp/aws00_cl_ml.
  DATA: ls_system TYPE l_str_system.
  DATA: lv_runtime_in_mins TYPE i.
  DATA: lv_msg TYPE string.
  DATA: lr_ex TYPE REF TO cx_root.

*"--- PROCESSING LOGIC ------------------------------------------------
  TRY.
      CREATE OBJECT lr_ml.

      "set parameters
      ls_system-comp_version = p_comp.
      ls_system-release = p_rel.
      ls_system-os = p_os.
      ls_system-db = p_db.
      ls_system-db_used = p_db_gb.
      ls_system-is_unicode = p_uc.
      ls_system-company_industry1 = p_ind.
      ls_system-scan_version = p_svers.

      "execute prediction
      lr_ml->predict(
        EXPORTING
          iv_model   = 'ml-BtUpHOFhbQd' "model name previously trained in AWS
          is_record  = ls_system
        IMPORTING
          ev_result  = lv_runtime_in_mins
      ).

      "output results
      lv_msg = /snp/cn00_cl_string_utils=>text( iv_text = 'Estimated runtime of &1 minutes' iv_1 = lv_runtime_in_mins ).
      MESSAGE lv_msg TYPE 'S'.

    CATCH cx_root INTO lr_ex.

      "output errors
      lv_msg = lr_ex->get_text( ).
      PERFORM display_lines USING lv_msg.

  ENDTRY.

ENDFORM.

FORM display_lines USING iv_multiline_test.
*"--- DATA DEFINITION -------------------------------------------------
  DATA: lt_lines TYPE stringtab.
  DATA: lv_line TYPE string.

*"--- PROCESSING LOGIC ------------------------------------------------
  "split into multiple lines...
  SPLIT iv_multiline_test AT cl_abap_char_utilities=>newline INTO TABLE lt_lines.
  LOOP AT lt_lines INTO lv_line.
    WRITE: / lv_line. "...and output each line individually
  ENDLOOP.

ENDFORM.

Now on the PHP side I simply used the AWS SDK for PHP. Setting it up is as easy as extracting a ZIP file, requiring the autoloader and using the API. I wrote a little wrapper class that I could easily expose as a REST service (not shown here).

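<?php

//autoloader from the extracted SDK ZIP file; adjust the path to your setup
require 'aws-autoloader.php';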

class SnpAwsMachineLearningApi {

   /**
   * Create an AWS ML Client Object
   */
   private function getClient($key,$secret) {
      return new Aws\MachineLearning\MachineLearningClient([
         'version' => 'latest',
         'region'  => 'us-east-1',
         'credentials' => [
            'key'    => $key,
            'secret' => $secret
         ],
      ]);
   }

   /**
   * Determine the URL of the Model Endpoint automatically
   */
   private function getEndpointUrl($model,$key,$secret) {

      //fetch metadata of the model
      $modelData = $this->getClient($key,$secret)->getMLModel([
         'MLModelId'=>$model,
         'Verbose'=>false
      ]);

      //check if model exists
      if(empty($modelData)) {
         throw new Exception("model ".$model." does not exist");
      }

      //getting the endpoint info
      $endpoint = $modelData['EndpointInfo'];

      //check if endpoint was created
      if(empty($endpoint)) {
         throw new Exception("no endpoint exists");
      }

      //check if endpoint is ready
      if($endpoint['EndpointStatus'] != 'READY') {
         throw new Exception("endpoint is not ready");
      }

      //return the endpoint url
      return $endpoint['EndpointUrl'];
   }

   /**
   * Execute a prediction
   */
   public function predict($model,$record,$key,$secret) {
      return $this->getClient($key,$secret)->predict(array(

          //provide the model name
         'MLModelId'       => $model,

         //make sure it's an associative array that is passed as the record
         'Record'          => json_decode(json_encode($record),true),

         //determine the URL of the endpoint automatically, assuming there is
         //exactly one
         'PredictEndpoint' => $this->getEndpointUrl($model,$key,$secret)
      ));
   }

}
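The REST wrapper itself is not shown in this post. As a rough sketch of what such an entry point could look like: the function name handlePredictRequest and the JSON fields "model" and "record" are assumptions made for this example, and the prediction call is injected as a callable so that the sketch runs without AWS credentials.

```php
<?php

/**
 * Hypothetical request handler: decodes a JSON body and delegates to a
 * prediction callback. The field names "model" and "record" are assumptions,
 * they are not taken from the original service.
 */
function handlePredictRequest($jsonBody, callable $predict) {
    $request = json_decode($jsonBody, true);

    //reject anything that is not a JSON object with the expected fields
    if (!is_array($request)
        || !isset($request['model'], $request['record'])
        || !is_array($request['record'])) {
        return array(
            'status' => 400,
            'body'   => json_encode(array('error' => 'model and record are required'))
        );
    }

    try {
        $result = call_user_func($predict, $request['model'], $request['record']);
        return array('status' => 200, 'body' => json_encode(array('result' => $result)));
    } catch (Exception $e) {
        //pass the error text on to the caller, e.g. "endpoint is not ready"
        return array('status' => 500, 'body' => json_encode(array('error' => $e->getMessage())));
    }
}
```

In a real service the callable would wrap the predict() method of the class above, with key and secret coming from server-side configuration, and the body would be read via file_get_contents('php://input').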

And that is basically it. Of course, in the future it would be great to get rid of the PHP part and have a purely ABAP-based SDK implementation, but again, this was supposed to be a quick and easy implementation.

Currently it enables ABAP developers to execute predictions on the AWS Machine Learning platform with any trained model without having to leave their terrain.

In the future this could be extended to creating or updating data sources from ABAP internal tables, creating and training models on the fly and, of course, abstracting things far enough that other Machine Learning providers can be plugged in. So why not explore the native SAP HANA capabilities next...

Saturday, August 26, 2006

Observers and Subjects

The best-known design pattern for decoupling object instances or classes from one another, while still giving the observer a way to know when a particular event occurs on a subject, is the Observer pattern. It is typically implemented using either abstract classes or interfaces.

While the former has the advantage that the abstract classes may provide reusable behaviour for observer registration and notification, it limits an observer to staying in the observer role. An observer of one subject cannot act as a subject itself (unless you rely on implicit coding/naming conventions rather than on an explicit mechanism). While this may not be a problem in languages that allow multiple inheritance, it is for languages that support single inheritance only.

The latter (interface-based) approach relieves you of this issue at the cost of implementing the required methods on every class that wants to make use of the pattern. While this may be inconvenient, it is a valid solution and works most of the time.
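To make the interface-based variant concrete, here is a minimal sketch using PHP's built-in SplObserver and SplSubject interfaces. The class name Relay is made up for this example, and the return type declarations require a newer PHP (7.1 or later) than the one available when this post was written. It shows one class acting as observer and subject at the same time, which is exactly what the abstract-class variant prevents under single inheritance.

```php
<?php

//A class can be observer and subject at once, because both roles are interfaces
class Relay implements SplObserver, SplSubject {

    private $observers = array();
    public $lastSubject = null;

    public function attach(SplObserver $observer): void {
        $this->observers[spl_object_hash($observer)] = $observer;
    }

    public function detach(SplObserver $observer): void {
        unset($this->observers[spl_object_hash($observer)]);
    }

    public function notify(): void {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    //observer role: remember who notified us, then pass the event on
    public function update(SplSubject $subject): void {
        $this->lastSubject = $subject;
        $this->notify();
    }
}

$first = new Relay();
$second = new Relay();
$first->attach($second);
$first->notify(); //$second->lastSubject is now $first
```

The price is visible as well: attach(), detach() and notify() have to be written out in every class that wants to be a subject.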

With dynamic languages such as PHP or JavaScript there is yet another way to approach the issue: a mediating class that organizes the event handling between arbitrary objects (or classes, or basically any entity you may think of). Here is an example of how this approach is used:

class Event { ... }


class Subject {

 public function doSomething() {
  //do something and throw the event
  Event::raise($this,'onDoSomething');
 }

}


class Observer {

 public function onEvent(Event $event) {
  //react to the event
  echo get_class($event->getSourceObject());
 }

}

$subject = new Subject();
$observer = new Observer();
Event::register($subject,'onDoSomething',$observer,'onEvent');
$subject->doSomething(); //prints 'Subject'

What you are now probably interested in the most is the implementation of the "Event"-class. As you have seen there are two static methods on it:

1) register() allows arbitrary methods or functions to be registered as callbacks for a particular event ('onDoSomething' in this case) of objects or classes (the latter is not yet supported by the following implementation)

2) raise() signals the occurrence of a particular event on an object or class (the latter is not yet supported by the following implementation). All previously registered callback methods (you may as well call them the observers) will be executed with a single argument: an "Event"-object instance.

This object may be used to query for particular aspects of the event happening using one of the following methods.

1) getSourceObject() returns the subject the event occurred on if existing

2) getName() returns the name of the event so that the observer may decide how to handle different types of events, if this is not already distinguished using different callback methods.

3) getTargetObject() returns a reference to the observer if it is an object instance

4) getTargetMethod() returns the name of the callback method

5) getParameters() returns an array of additional parameters that the subject may hand over to raise().

A possible implementation of the "Event"-class may look as shown below. Please note that this version currently only supports object-to-object event handling:

class Event {

    protected $targetObject;
    protected $targetMethod;
    protected $eventName;
    protected $sourceObject;
    protected $parameters;

    protected function __construct($targetObject,
                                   $targetMethod,
                                   $eventName,
                                   $sourceObject,
                                   array $parameters = array()) {
        $this->targetObject = $targetObject;
        $this->targetMethod = $targetMethod;
        $this->eventName = $eventName;
        $this->sourceObject = $sourceObject;
        $this->parameters = $parameters;
    }

    public function getSourceObject() {
        return $this->sourceObject;
    }

    public function getName() {
        return $this->eventName;
    }

    public function getTargetMethod() {
        return $this->targetMethod;
    }

    public function getTargetObject() {
        return $this->targetObject;
    }

    public function getParameters() {
        return $this->parameters;
    }

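    public static function raise($onObject,
                                 $event,
                                 array $parameters = array()) {
        //nothing to do if no callback was registered for this event
        if(empty($onObject->__eventRegistry[$event])) {
            return;
        }
        foreach($onObject->__eventRegistry[$event] as $target) {
            if($target[0]) {
                //callback is a method on an observer object
                call_user_func_array($target,
                                     array(
                                         new Event(
                                             $target[0],
                                             $target[1],
                                             $event,
                                             $onObject,
                                             $parameters)
                                     )
                );
            } else {
                //callback is a plain function
                call_user_func_array($target[1],
                                     array(
                                         new Event(
                                             NULL,
                                             $target[1],
                                             $event,
                                             $onObject,
                                             $parameters)
                                     )
                );
            }
        }
    }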

    public static function register($sourceObject,
                                    $forEvent,
                                    $targetObject,
                                    $targetMethod) {
        $sourceObject->__eventRegistry[$forEvent][] = array($targetObject,$targetMethod);
    }
}

This implementation should be session-safe as well, as the dynamically built "__eventRegistry" should be serialized to and restored from the session automatically. For class-based event registries this may be a little more tricky to implement.

But once the "strict" keyword is incorporated into PHP, this approach will no longer work on classes that are marked strict.
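The session claim can be checked with a small round trip through serialize() and unserialize(). This is only a sketch: stdClass objects stand in for the subject and the observer, and the registration statement is copied from register().

```php
<?php

//stdClass stands in for the subject and the observer of the example above
$subject = new stdClass();
$observer = new stdClass();
$observer->name = 'observer';

//same statement that Event::register() executes
$subject->__eventRegistry['onDoSomething'][] = array($observer, 'onEvent');

//round trip through serialize() and unserialize()
$restored = unserialize(serialize($subject));

$callback = $restored->__eventRegistry['onDoSomething'][0];
echo $callback[0]->name . '::' . $callback[1]; //the registration survived
```

Note that the observer is serialized along with the subject, so the restored registry points to a copy of the observer rather than to the original instance.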

Saturday, August 19, 2006

Enumerations in PHP

Sometimes you stumble over stuff on the net and think to yourself: Wow! Why isn't everybody talking about this yet? My subject of the day is enumerations in PHP via SPL_Types. Basically it brings enumerations to PHP, ensuring that a variable may only contain specific values from a value domain. It should look something like this:


class Weekday extends SplEnum
{
  const Sunday = 0;
  const Monday = 1;
  const Tuesday = 2;
  const Wednesday = 3;
  const Thursday = 4;
  const Friday = 5;
  const Saturday = 6;
  const __default = Weekday::Sunday;
}

$e = new Weekday;

var_dump($e); // shows object of type SplEnum
var_dump((int)$e); // int(0)

$e++;

var_dump($e); // shows object of type SplEnum
var_dump((int)$e); // int(1)
var_dump($e + 3); // int(4)

I found this example in a mailing list transcript at BeebleX. While there is certainly a trace of this functionality in the CVS repository at php.net, you cannot yet get a precompiled version of the extension at PHP-Snaps :-(.

Personal note: While I love PHP for the fact that it's a dynamically typed language, I consider enumerations a big help in communicating and ensuring consistency of parameter values on public interfaces.
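Until the extension is available, a value domain can be approximated in plain PHP. The following is a sketch of that idea and not the SPL_Types implementation; the class name PlainWeekday and its getValue() method are made up for this example.

```php
<?php

/**
 * Plain-PHP stand-in for an enumeration; not the SPL_Types implementation.
 */
final class PlainWeekday {
    const Sunday = 0;
    const Monday = 1;
    const Tuesday = 2;
    const Wednesday = 3;
    const Thursday = 4;
    const Friday = 5;
    const Saturday = 6;

    private $value;

    public function __construct($value = self::Sunday) {
        //the class constants define the value domain
        $reflection = new ReflectionClass(__CLASS__);
        if (!in_array($value, $reflection->getConstants(), true)) {
            throw new InvalidArgumentException(
                'not a valid weekday: ' . var_export($value, true)
            );
        }
        $this->value = $value;
    }

    public function getValue() {
        return $this->value;
    }
}

$day = new PlainWeekday(PlainWeekday::Friday);
var_dump($day->getValue()); // int(5)
```

A method signature can then type-hint PlainWeekday, and any value outside the domain is rejected when the object is created.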

Validating user input using PECL::filter

PECL::filter is a promising new PHP extension that will ship with the upcoming PHP 5.2.0. It is mainly meant for validating input, which currently requires quite some work. Since there is no documentation over at php.net, you can read all about it over here. For all of you using Windows as your development environment, here you can download the compiled extension for your current PHP installation.
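As a short illustration of the kind of validation this enables, here is a sketch using filter_var() as it exists in the filter extension bundled with released PHP versions. The function names in the pre-release PECL package discussed here may have differed, and the sample values are made up.

```php
<?php

//validate an integer within a range; returns the int or false
$age = filter_var('42', FILTER_VALIDATE_INT, array(
    'options' => array('min_range' => 0, 'max_range' => 150)
));

//validate an e-mail address; returns the string or false
$email = filter_var('someone@example.com', FILTER_VALIDATE_EMAIL);
$broken = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);

var_dump($age, $email, $broken);
```

Because a failed validation returns false, results should be compared with === so that a valid 0 is not mistaken for a failure.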

Thursday, August 10, 2006

Calculating distances

Since I seem to have developed a soft spot for geo-calculation ever since Google Maps hit me, I have been looking for information on how to calculate distances between two points on the globe.

The GoogleAPI actually has a method to calculate the distance between points based on a spherical model.

But what if you'd like to implement a radius search on the database or calculate distances on the backend? I found a nice introduction to the subject over at MeridianWorldData, which takes into account various levels of precision to gain performance where it is needed.

Also I found a PHP based example for distance calculation as well as a series of articles on the subject of GIS mapping.
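For a backend calculation on a spherical model, the haversine formula is a common choice. The sketch below is not taken from any of the linked articles; the function name distanceKm and the mean earth radius of 6371 km are assumptions of this example.

```php
<?php

/**
 * Great-circle distance in kilometres using the haversine formula on a
 * spherical earth model (mean radius 6371 km, an approximation).
 */
function distanceKm($lat1, $lon1, $lat2, $lon2, $radiusKm = 6371.0) {
    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);

    //haversine of the central angle between the two points
    $a = pow(sin($dLat / 2), 2)
       + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * pow(sin($dLon / 2), 2);

    //min() guards against rounding errors pushing the value above 1
    return 2 * $radiusKm * asin(min(1.0, sqrt($a)));
}

//a quarter of the way around the equator
echo distanceKm(0, 0, 0, 90);
```

For a radius search, a cheap bounding-box filter on latitude and longitude can narrow down the candidates first, so that the exact formula only runs on the remaining rows.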

Mashup GoogleMaps and GeoIP

Ok I have put together a small example of how to use GoogleMaps and GeoIP. Use it to look up where a specific URL or server is hosted.

You can view it here. The application includes a bookmarklet, which you may bookmark or place in the links bar of your browser so you can use it from any website to determine its location.

For all you PHP freaks, you can download the complete example as well. Just unpack the ZIP file on your PHP5-enabled server and you should be good to go. Just make sure to use your own GoogleMaps API key, which you can get here.

Wednesday, August 09, 2006

Matching IP to country

I have again found a nice article about how to match IPs to countries over at builder.com. It uses PHP::Pear's Net_GeoIP class for the querying. But what really interests me is the data behind it. Luckily MaxMind does provide this data in various flavors.

There is a free CSV export at the beginning of every month, which can be downloaded on country level and on city level.

However, there are also two APIs for PHP: a pure PHP implementation and a PHP extension.
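To get a feeling for what a lookup against such range data involves, here is a sketch with a binary search over sorted IPv4 ranges. The row layout (start number, end number, country code) is an assumption of this example and should be checked against the actual CSV columns; the function name lookupCountry and the sample ranges with their country codes are made up.

```php
<?php

/**
 * Looks up the country code for an IPv4 address in a list of ranges sorted by
 * their start value. Each range is array(startNumber, endNumber, countryCode);
 * this layout is an assumption for the sketch, check the actual CSV columns.
 */
function lookupCountry($ip, array $ranges) {
    $long = ip2long($ip);
    if ($long === false) {
        return null; //not a valid IPv4 address
    }
    $number = (float) sprintf('%u', $long); //unsigned value, also on 32-bit PHP

    //binary search over the sorted ranges
    $low = 0;
    $high = count($ranges) - 1;
    while ($low <= $high) {
        $mid = (int) (($low + $high) / 2);
        if ($number < $ranges[$mid][0]) {
            $high = $mid - 1;
        } elseif ($number > $ranges[$mid][1]) {
            $low = $mid + 1;
        } else {
            return $ranges[$mid][2];
        }
    }
    return null; //address not covered by any range
}

//made-up sample ranges (documentation address blocks), not real GeoIP data
$ranges = array(
    array(3221225984, 3221226239, 'AA'), //192.0.2.0 - 192.0.2.255
    array(3325256704, 3325256959, 'BB'), //198.51.100.0 - 198.51.100.255
);

var_dump(lookupCountry('192.0.2.17', $ranges));
```

With the full data set, the same comparison can be expressed as a range query on an indexed database table instead of an in-memory array.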

I am definitely going to try these out!

Tuesday, August 08, 2006

Creating MS Office documents using PHP

I have recently discovered an article on how to generate MS Office documents using PHP. Basically the article does not contain anything new: it demonstrates how to use the COM API to create Word and PowerPoint documents and how an Excel file can be created from an HTML table. While it will not enable you to approach the issue in an OS-independent way (you will in fact need a Windows server to create Word and PowerPoint documents), it still brings together the concepts for all three major MS Office products and contains easy-to-understand code snippets.
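The Excel part is the one that works on any operating system, since it only produces text. The following sketch shows the general idea and is not the article's code; the function name rowsToHtmlTable and the sample rows are made up.

```php
<?php

/**
 * Builds an HTML table from rows of scalar values. Excel can usually open
 * such a file when it is served or saved with an .xls name.
 */
function rowsToHtmlTable(array $rows) {
    $html = "<table>\n";
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            //escape the cell content so that it cannot break the markup
            $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

$table = rowsToHtmlTable(array(
    array('Product', 'Price'),
    array('Tea & Coffee', 4.5),
));

//in a web request, headers like these would precede the output:
//header('Content-Type: application/vnd.ms-excel');
//header('Content-Disposition: attachment; filename="report.xls"');
echo $table;
```

Newer Excel versions may show a warning that the file content does not match its extension before opening it.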

24 ways to impress your friends

24 ways has a collection of 24 very nice articles about bleeding-edge JavaScript and CSS hacks to make things look good and work well. Definitely worth a look or two if you are in the web business.

Tuesday, July 18, 2006

dompdf - The PHP 5 HTML to PDF Converter

Finally there seems to be a decent API for PHP developers to render HTML to PDF. I have just downloaded and unpacked dompdf on my WAMP and it worked right away.

Ok, I had to disable php_domxml.dll first, but after that it worked perfectly. Well, almost, since there seems to be a problem with special characters. I guess I need to import some additional fonts first before this works correctly.

Anyhow this API is a big step forward for PHP developers trying to find an easy and convenient way to create PDFs from what they know best: PHP and HTML.

Sunday, July 09, 2006

Patterns For PHP - WebSites

Since PHP-Patterns has not received any new content for quite a while (probably because Harry Fuecks has moved his activities over to Sitepoint), the new Patterns For PHP website seems to take the same approach of an open wiki on the subject of design patterns for PHP.

Hopefully this time there will be a longer-lasting effort on the subject, since I personally like pattern pages very much. For all those of you who are craving patterns on an abstract, non-implementational level, Martin Fowler's website might be worth a look.