Tuesday, October 18, 2011

Use Case: Grouping by Time, A Call Center Story

Recently a company that operates a rather large call center came to Pentaho to help them be able to better analyze their data and gain better insight to the performance and efficiency of their agents.  One of the issues they had was that they wanted to see their call center data grouped by particular time categories so that they can better manage staffing for peak hours of the day as well as view who are the top performing agents in terms of volume.  They had some basic information but the problem was that they wanted to have the grouping within the database so that they can gain some additional performance in pre-aggregating the data, due to the large volume of calls that are occurring.  Here is a sample of what the data looked like (the sample below is only sample data generated by a sample data generator).



The first step in this process is to make the data "BI Ready", in other words take the data and convert to information by enriching it and grouping it by the designated time buckets, for this particular example they want to bucket their time in 15 minute increments starting from 8:00 to 5:00 (the operating hours of the call center).  In this example, I am only using a snapshot of data ranging from 8:00 AM to 9:00 AM.  This will allow them to answer questions like

  • During what time span (within 15 minutes) is the call volume the highest?
  • What time categories have the highest volume of incoming calls?
  • What time categories have the highest volume of outbound calls?


The end transformation looks like this:



Transformation Detailed Description

Read Call Center Data from CSV - This is a csv Input Step that reads in the data from the CSV file.  This does not have to be a csv file but can be almost any flat file, database, web service, etc.

Convert Date-Time to String - This is a Select Values step that allows you to rename or change the type of any of your fields, in this step we are converting the date-time  to a string in order to split it into two different fields, one field that contains only date and the other only time.

Split Date-Time Field - Field Splitter allows you to split any string field based on a common requirement.  In this example because the date-time field called BeginDate in our csv file has a date-time field that is a format where the date and time are separated by a space, we use that as the delimiter to split the field.

Convert Time to Number - The convert time to number is used to convert the Time field that we created in the previous step to a number in order to do our grouping into time ranges.

Create Time Range - This step will create a new column called TimeGroup, read in the rows of data, look at the time field set the appropriate value to the newly created field.  For this example we are creating four groups:  8:00 AM to 8:15 AM, 8:15 AM to 8:30 AM, 8:30 AM to 8:45 AM, and 8:45 AM to 9:00 AM.

Convert Time to Time Format - This step converts the time field back to the time data type for reporting purposes.

Dummy (do nothing) - this step, as it implies does nothing, at this point you can load the data directly into a database, flat file, etc.

The end result of this transformation looks like this:



Download the zip file here for youto run this transformation.  In order for it to work, you will need to import the .ktr file and then make sure to change the path of the csv file in the first step to the location of where you saved the csv file that is included in the .zip file.

Pentaho Data Integration Sample Library


When you download Pentaho and run our Data Integration client tool (also known as Spoon), you also get a huge library of sample transformations that can help you build out your solution.  The great thing about these sample transformation and jobs is that they are all working samples, so you can test and learn by following the entire flow.  There are over 150 working sample transformation and over 40 working jobs.  You can access these transformation and jobs anytime by going to the following directory:

[Install Directory]\pentaho\design-tools\data-integration\samples

To use any of these transformation in Pentaho Data Integration follow these three easy steps:

  1. Start Pentaho Data Integration Client Tool (Spoon)
  2. Go to File > Import from an XML File


  3.   Browse to the samples library location stated above and select the sample you wish to view and run.