
This is the reference guide to get started with the Kill Bill Analytics plugin.

Overview

Kill Bill maintains a reporting schema that is kept up to date in real time for analytics purposes. Unlike other tables in the system, this set of tables is stable and should only change between major revisions of Kill Bill. These tables are meant to be used as building blocks for reports and dashboards.

This analytics feature is provided as a Kill Bill plugin. To install it, copy the killbill-osgi-bundles-analytics jar to /var/tmp/bundles/platform/ (or to your custom OSGI bundles directory if you have overridden the default path) and run the provided DDL script.

For ease of tracking, Analytics objects contain id references to the Kill Bill ones, but we strongly advise against using them for reporting purposes. Instead, use the external keys as much as possible, since they are tied to your business concepts. For instance, to count the number of accounts in the system, run a query using count(distinct account_key), not count(distinct account_id).
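
As a concrete illustration, using the analytics_accounts table described below:

-- Preferred: count accounts by their external key, which maps to your business concepts
select count(distinct account_key) from analytics_accounts;

-- Discouraged: counting by the internal id ties the report to Kill Bill internals
select count(distinct account_id) from analytics_accounts;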

Data has been denormalized on purpose to avoid joins as much as possible.

Internals

This plugin listens to events generated by Kill Bill and rebuilds the complete state of its mirrored data upon receipt. The refresh jobs (one per event) are kept in a notification queue. By default, it uses the same notification queue tables as the rest of the system, but we encourage you to use separate ones. To do so, set the following properties (the tables are created by the provided DDL script):

org.killbill.notificationq.analytics.tableName=analytics_notifications
org.killbill.notificationq.analytics.historyTableName=analytics_notifications_history

These jobs are kept in the refresh-queue queue of the AnalyticsService service. For debugging purposes, search key 1 is the account record id and search key 2 is the tenant record id.
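
For example, to inspect pending refresh jobs for a given account, a query along these lines can be used (this assumes the standard Kill Bill notification queue layout, with queue_name, effective_date, processing_state, search_key_1 and search_key_2 columns; adjust to your actual schema):

-- Pending Analytics refresh jobs for account record id 123 in tenant record id 1 (illustrative values)
select record_id, effective_date, processing_state
from analytics_notifications
where queue_name = 'refresh-queue'
and search_key_1 = 123 -- account record id
and search_key_2 = 1   -- tenant record id
order by effective_date asc;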

Currency conversion

Each table containing monetary columns (e.g. price, mrr, etc.) also contains a currency column, expressing the currency of the monetary values. This can be difficult to work with, as a table can contain rows in various currencies: any arithmetic across them won’t make sense.

To work around this problem, the Analytics plugin has the concept of a converted currency, a single global currency of reference. Each monetary column has an associated column containing the value converted into this currency.

By default, all converted values are NULL. To enable this feature, you need to populate the analytics_currency_conversion table with the rates you’d like to use for the various currencies, for a given period of time. For example, you may choose to insert new rates at the beginning of each month.
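
For example, here is a minimal sketch inserting monthly EUR rates against a USD reference (the currency, start_date, end_date, reference_rate and reference_currency column names are assumptions; check the DDL shipped with the plugin for the exact schema):

-- Hypothetical EUR-to-USD rates, one row per month
insert into analytics_currency_conversion (currency, start_date, end_date, reference_rate, reference_currency)
values ('EUR', '2013-09-01', '2013-09-30', 1.3200, 'USD')
     , ('EUR', '2013-10-01', '2013-10-31', 1.3500, 'USD');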

By default, the reference currency is USD, but you can change it via:

com.ning.billing.osgi.bundles.analytics.referenceCurrency=EUR

Accounts

The table analytics_accounts maintains information about Kill Bill accounts. The table doesn’t contain historic records (it will contain at most one record per Kill Bill account).

Bundles

The table analytics_bundles maintains information about Kill Bill bundles. The table doesn’t contain historic records (it will contain at most one record per Kill Bill bundle).

Subscriptions

The table analytics_subscriptions maintains information about Kill Bill subscriptions. It describes the history of changes (both past and future) on a per subscription basis.

analytics_subscriptions will contain all phase transitions for all subscriptions and can be used to build subscription timelines. A few remarks:

  • Given a subscription id, events are sorted chronologically when stored (you can rely on record_id asc)

  • prev_ fields will always be NULL for creation events

  • next_ fields will always be NULL for cancellation events

  • A CHANGE event corresponds to a user-initiated change (e.g. an upgrade), whereas SYSTEM_CHANGE corresponds to a phase event (e.g. the transition from the trial to the evergreen phase)

  • The requested_timestamp field reflects when the transition was requested, while the next_start_date field reflects the effective transition time
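
For instance, here is a sketch of a timeline query for a single subscription (the subscription_id, event, prev_phase and next_phase column names are illustrative; the exact columns are defined in the plugin’s DDL):

-- Phase transitions for one subscription, in storage order
select record_id
     , event               -- e.g. CHANGE or SYSTEM_CHANGE
     , requested_timestamp -- when the transition was requested
     , next_start_date     -- when the transition becomes effective
     , prev_phase          -- NULL for creation events (illustrative column name)
     , next_phase          -- NULL for cancellation events (illustrative column name)
from analytics_subscriptions
where subscription_id = 'f6da101f-b43c-4bb7-920d-188eaa5ae5f8'
order by record_id asc;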

Invoices

The tables analytics_invoices, analytics_invoice_adjustments, analytics_invoice_item_adjustments, analytics_invoice_credits and analytics_invoice_items maintain information about the invoices posted on each account.

Definitions:

  • invoice_balance: invoice_amount_charged + invoice_amount_credited + invoice_amount_adjusted_for_account_credit - (invoice_amount_paid + invoice_amount_refunded)

  • invoice_amount_charged: sum of all analytics_invoice_items, analytics_invoice_adjustments and analytics_invoice_item_adjustments items for that invoice

  • invoice_amount_credited: sum of all analytics_invoice_credits items for that invoice (positive if granted, negative if used)

  • invoice_amount_paid: sum of all analytics_payments items for that invoice

  • invoice_amount_refunded: sum of all analytics_chargebacks and analytics_refunds items for that invoice

  • invoice_original_amount_charged: sum of all analytics_invoice_items items for that invoice at the time of invoice creation

Note that invoice_amount_adjusted_for_account_credit is an internal adjustment that applies to account-level credits only (CREDIT_ADJ items on their own invoices) and is not represented in Analytics. The full details of the computations can be found in InvoiceCalculatorUtils.java.
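
To illustrate the arithmetic, here is a sketch recomputing the balance from its exposed components (because invoice_amount_adjusted_for_account_credit is internal, the recomputed value can differ for invoices carrying account-level credits; the table and column names follow the definitions above and may need to be adjusted to your schema):

-- Recompute the invoice balance from its exposed components and compare it to the stored value
select distinct invoice_id
     , invoice_amount_charged
       + invoice_amount_credited
       - (invoice_amount_paid + invoice_amount_refunded) as recomputed_balance
     , invoice_balance as stored_balance
from analytics_invoice_items;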

A few remarks:

  • EXTERNAL_CHARGE, FIXED and RECURRING items are treated as regular line items (stored in analytics_invoice_items)

  • ITEM_ADJ items are treated as invoice item adjustments (stored in analytics_invoice_item_adjustments)

  • CBA_ADJ items (gained or consumed) are treated as account credits (stored in analytics_invoice_credits)

  • REFUND_ADJ items are treated as invoice adjustments (stored in analytics_invoice_adjustments)

  • CREDIT_ADJ items that are not on their own invoice are treated as invoice adjustments (stored in analytics_invoice_adjustments). The internal name of CREDIT_ADJ is misleading in this case and will probably be renamed to INVOICE_ADJ in a future release. Note that these should not be confused with account-level credits: when an administrator creates an account credit (different from the CBA credits generated by the system), a negative CREDIT_ADJ item is added to a new invoice. This triggers a positive CBA_ADJ item (to bring the balance back to zero), which effectively becomes the account credit (and is eventually stored in analytics_invoice_credits). In this case, the CREDIT_ADJ item is not stored directly in the analytics tables

  • Kill Bill repair items (REPAIR_ADJ) are not stored directly in any of these tables, but are combined with the new reparee item as invoice item adjustments (in analytics_invoice_item_adjustments). For example, if a recurring item of $20 was repaired into a $5 recurring item, analytics_invoice_items would contain the original $20 line item and analytics_invoice_item_adjustments would contain a $-15 adjustment item

  • Every invoice item has a revenue recognizable attribute, which indicates whether the item is recognizable from a finance perspective (i.e. tied to cash). Today, this is only used for credits: we assume that any administrator-generated credit (CREDIT_ADJ and CBA_ADJ on their own invoice) is not recognizable. All other credit items, including system-generated ones, are.
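
For example, assuming the credits table exposes this attribute as a revenue_recognizable column and the credit value as an amount column (both names are assumptions; check the DDL), recognizable and non-recognizable credits can be split out as follows:

-- Credit amounts split by revenue recognizability (column names assumed)
select revenue_recognizable
     , currency
     , sum(amount) as total_amount
from analytics_invoice_credits
group by 1, 2;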

Payments

The tables analytics_payments, analytics_refunds and analytics_chargebacks maintain information about Kill Bill payments, refunds and chargebacks.

Entitlement states

The table analytics_entitlement_states maintains information about Kill Bill entitlement states, including overdue statuses. The table contains the history of statuses for each account.

Tags and custom fields

Information about tags and custom fields is captured in the tables analytics_account_tags, analytics_bundle_tags, analytics_invoice_tags, analytics_payment_tags, analytics_account_fields, analytics_bundle_fields, analytics_invoice_fields and analytics_payment_fields. These tables don’t contain historic records (they contain the current state of tags and custom fields).
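
For instance, to count how many accounts carry each tag, a sketch along these lines can be used (the name and account_key columns are assumptions; check the DDL shipped with the plugin):

-- Number of distinct accounts per tag (column names assumed)
select name as tag_name
     , count(distinct account_key) as accounts
from analytics_account_tags
group by 1
order by 2 desc;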

Dashboards and reports

The Analytics plugin provides two types of APIs: one to retrieve and manipulate the data and one to create dashboards.

Data API

To retrieve all data for a given account:

curl -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     "http://127.0.0.1:8080/plugins/killbill-analytics/f6da101f-b43c-4bb7-920d-188eaa5ae5f8"

To force a refresh:

curl -v \
     -X PUT \
     -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     "http://127.0.0.1:8080/plugins/killbill-analytics/f6da101f-b43c-4bb7-920d-188eaa5ae5f8"

Dashboards

The Analytics plugin provides simple, yet powerful, dashboarding capabilities.

A dashboard consists of a number of reports, each numbered starting from 1. All reports are displayed in a single column, with report number 1 at the top. All reports share the same X axis.

Currently, we support displaying data as pie charts or time series. For the former, label and count columns are required to describe the sectors. For the latter, the only required column is day (the reference time, used for the X axis); you can specify any number of dimensions and metrics. If multiple dimensions are specified, one line will be drawn for each combination of dimension values (and for each metric). For example, if your dimensions are currency and country, one line per combination (and per metric) will be drawn: one for USD/USA, one for USD/Mexico, one for MXN/USA, one for MXN/Mexico, etc. If multiple metrics are specified, each of them (times the cardinality of the dimensions) will be drawn as its own line.
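
As an illustration of the pie chart contract, a hypothetical view exposing just label and count columns could be built on top of the Analytics tables, for example (assuming the currency column on analytics_accounts):

-- Accounts per currency, shaped for a pie chart (one sector per label)
create or replace view v_accounts_per_currency as
select
  currency as label
, count(*) as count
from analytics_accounts
where report_group = 'default'
group by 1;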

Reports configuration

Let’s say you want to look at the number of accounts created per day; the query might look like this:

select
  tenant_record_id
, date_format(created_date, '%Y-%m-%d') as day
, count(*) as count
from analytics_accounts
where report_group = 'default'
group by 1,2
order by 1,2 asc
;

The first step is to create a view for the query:

create or replace view v_new_accounts_per_day as
select
  tenant_record_id
, date_format(created_date, '%Y-%m-%d') as day
, count(*) as count
from analytics_accounts
where report_group = 'default'
group by 1,2
order by 1,2 asc
;

You can now create the report:

curl -v \
     -X POST \
     -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     -H 'Content-Type: application/json' \
     -d '{"reportName": "new_accounts_per_day",
          "reportPrettyName": "New accounts created (per day)",
          "sourceTableName": "v_new_accounts_per_day"}' \
     "http://127.0.0.1:8080/plugins/killbill-analytics/reports"

Where:

  • reportName is a unique identifier for your report

  • reportPrettyName is a description, which will be used on the dashboards

  • sourceTableName is the name of the source table (view in our case)

If the query is expensive, consider materializing it on a regular basis. Assuming you have a stored procedure refresh_new_accounts_per_day, which materializes v_new_accounts_per_day into new_accounts_per_day, you can tell the Analytics plugin to run it daily at 4am GMT:

curl -v \
     -X POST \
     -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     -H 'Content-Type: application/json' \
     -d '{"reportName": "new_accounts_per_day",
          "reportPrettyName": "New accounts created (per day)",
          "sourceTableName": "new_accounts_per_day",
          "refreshProcedureName": "refresh_new_accounts_per_day",
          "refreshFrequency": "DAILY",
          "refreshHourOfDayGmt": 4}' \
     "http://127.0.0.1:8080/plugins/killbill-analytics/reports"

Alternatively, you can run it on an hourly basis (it will run at 5 minutes past the hour):

curl -v \
     -X POST \
     -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     -H 'Content-Type: application/json' \
     -d '{"reportName": "new_accounts_per_day",
          "reportPrettyName": "New accounts created (per day)",
          "sourceTableName": "new_accounts_per_day",
          "refreshProcedureName": "refresh_new_accounts_per_day",
          "refreshFrequency": "HOURLY"}' \
     "http://127.0.0.1:8080/plugins/killbill-analytics/reports"

Here is how you can create such a stored procedure:

-- Create an empty table with the same structure as the view
create table new_accounts_per_day as select * from v_new_accounts_per_day limit 0;

drop procedure if exists refresh_new_accounts_per_day;
DELIMITER //
CREATE PROCEDURE refresh_new_accounts_per_day()
BEGIN
  -- Replace the materialized data with the current content of the view
  delete from new_accounts_per_day;
  insert into new_accounts_per_day select * from v_new_accounts_per_day;
END;
//
DELIMITER ;
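
You can then test the refresh manually before relying on the scheduled job:

call refresh_new_accounts_per_day();
select count(*) from new_accounts_per_day;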

To retrieve your report configuration, use the report name:

curl -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     "http://127.0.0.1:8080/plugins/killbill-analytics/reports/new_accounts_per_day"

To update it (you can edit all fields except reportName):

curl -v \
     -X PUT \
     -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     -H 'Content-Type: application/json' \
     -d '{"reportPrettyName": "New accounts created (refreshed at 2am GMT)",
          "refreshFrequency": "DAILY",
          "refreshHourOfDayGmt": 2}' \
     "http://127.0.0.1:8080/plugins/killbill-analytics/reports/new_accounts_per_day"

Finally, to delete it:

curl -v \
     -X DELETE \
     -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     "http://127.0.0.1:8080/plugins/killbill-analytics/reports/new_accounts_per_day"

You can download the data behind your report as CSV or JSON (via the format query parameter):

curl -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
     'http://127.0.0.1:8080/plugins/killbill-analytics/reports?name=new_accounts_per_day&format=csv'

We provide a set of canned reports that can be useful as a starting point. See https://github.com/killbill/killbill-analytics-plugin/tree/master/src/main/resources/.

Dashboard API

The dashboard system is controlled by query parameters:

  • report1, report2, etc.: report name (from the configuration). The number determines in which slot the data should be displayed, starting from the top of the page. For example, report1=trials&report1=conversions&report1=cancellations&report2=accounts will graph the trials, conversions and cancellations reports in the first slot (on the same graph), and the accounts report below (in slot 2)

  • startDate and endDate: dates to filter the data on the server side. For example: startDate=2012-08-01&endDate=2013-10-01

  • smooth1, smooth2, etc.: smoothing function to apply to the data in a given slot. Currently supported smoothing functions are:

    • AVERAGE_WEEKLY: average the values on a weekly basis

    • AVERAGE_MONTHLY: average the values on a monthly basis

    • SUM_WEEKLY: sum all values on a weekly basis

    • SUM_MONTHLY: sum all values on a monthly basis

  • title1, title2, etc.: titles for each graph. If not specified, the default title (configured in the database) is used
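
For example, a dashboard URL combining several of these parameters (the report names and dates are illustrative) could look like: http://127.0.0.1:8080/plugins/killbill-analytics/static/analytics.html?report1=new_accounts_per_day&report2=payments_per_day&smooth2=SUM_WEEKLY&startDate=2013-01-01&endDate=2013-12-31&title1=New%20accounts%20per%20day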

You can narrow down the dimensions and metrics you want to plot for each report, and even specify filters. This is useful if the underlying view is a cube, for example, and you want to extract part of the data.

The URL format for each report looks like: report1=payments_per_day^dimension:currency^dimension:state^metric:amount^metric:fee or if you want to override the legend: report1=payments_per_day(Payments per day)^dimension:currency^dimension:state^metric:amount^metric:fee

You can specify as many dimensions and metrics as you want (each one of them should correspond to a column name in your table or view, and metrics are expected to be representable by floats).

Aggregate functions are supported for metrics: report1=payments_per_day^dimension:currency^dimension:state^metric:sum(amount)^metric:count(distinct fee)^metric:100*sum(fee)/amount

You can also use the filter keyword to narrow down your dataset, for example: report1=payments_per_day^dimension:currency^dimension:state^metric:amount^metric:fee^filter:(currency=USD%26state!=ERRORED)|(currency=EUR%26state=PROCESSED)

Note that & is encoded as %26, so the server does not interpret it as a standalone query parameter.

You can have as many filters as you want; they will be ORed together into a single filter in the query.

Finally, you can group certain rows together using the following syntax on dimensions (in this example, three groups plus an Other group will be created): report1=payments_per_day^dimension:currency(USD|EUR|BRL,GBP,MXN,AUD)^dimension:state^metric:amount^metric:fee

You can also set up aliases and/or skip the Other group, as follows: report1=payments_per_day^dimension:currency(USD=Main currency|EUR=Second currency|BRL,GBP,MXN,AUD=Third currency|-)^dimension:state^metric:amount^metric:fee

For debugging purposes, you can check the generated SQL by appending the sqlOnly parameter to the request:

curl -u admin:password \
     -H "X-Killbill-ApiKey:bob" \
     -H "X-Killbill-ApiSecret:lazar" \
    'http://127.0.0.1:8080/plugins/killbill-analytics/reports?name=payments_per_day^dimension:currency^dimension:state^metric:amount^metric:fee^filter:(currency=USD%26state!=ERRORED)|(currency=EUR%26state=PROCESSED)&sqlOnly=true'

To continue with our example above, you can see the report by going to http://127.0.0.1:8080/plugins/killbill-analytics/static/analytics.html?report1=new_accounts_per_day.