Finding the Current Kill Bill Version/Plugin Information

You can find the current Kill Bill version/information about plugins by invoking the following endpoint:

curl -v \
     -u admin:password \
     http://127.0.0.1:8080/1.0/kb/nodesInfo

This displays a response similar to the following:

[
  {
    "nodeName": "DESKTOP-NCUS49R",
    "bootTime": "2022-08-10T06:30:52.000Z",
    "lastUpdatedDate": "2022-08-10T06:30:52.000Z",
    "kbVersion": "0.22.18",
    "apiVersion": "0.53.17",
    "pluginApiVersion": "0.26.4",
    "commonVersion": "0.24.9",
    "platformVersion": "0.40.4",
    "pluginsInfo": [
      {
        "bundleSymbolicName": "org.kill-bill.billing.plugin.java.analytics-plugin",
        "pluginKey": "analytics",
        "pluginName": "analytics-plugin",
        "version": "7.3.0-SNAPSHOT",
        "state": "STOPPED",
        "isSelectedForStart": true,
        "services": []
      }
    ]
  }
]

Alternatively, you can also find the Kill Bill version via Kaui by accessing http://<KAUI_HOME>/kpm.

Healthchecks and metrics

Kill Bill exposes both healthchecks and metrics. These can be accessed via:

curl -v http://<IP_ADDRESS>:8080/1.0/healthcheck?pretty=true
curl -v http://<IP_ADDRESS>:8080/1.0/metrics?pretty=true

Remote Debugging

Remote debuggging Kill Bill running in Tomcat

In order to remote debug Kill Bill running in Tomcat, the following needs to be done:

  1. Ensure that you have Kill Bill and Kaui installed in Tomcat as explained in the Getting Started guide.

  2. Set the following environment variable:

    JPDA_OPTS=-agentlib:jdwp=transport=dt_socket,address=12345,server=y,suspend=n
  3. Start Tomcat as follows:

    catalina.sh jpda start
  4. Create a new remote debug configuration in Eclipse/Intellij with Host=localhost and Port=12345.

  5. Add a breakpoint to the code.

  6. Execute any action that invokes the code with the breakpoint. This will run the code in debug mode.

Remote debuggging Kill Bill running in Docker

In order to remote debug Kill Bill running in Docker, the following needs to be done:

  1. Expose port 12345 and specify the JVM_JDWP_PORT environment variable in your docker-compose file as follows:

    version: '3.2'
    volumes:
      db:
    services:
      killbill:
        image: killbill/killbill:0.24.10
        ports:
          - "8080:8080"
          - 12345:12345
        environment:
          - JVM_JDWP_PORT=*:12345
          - KILLBILL_DAO_URL=jdbc:mysql://db:3306/killbill
          - KILLBILL_DAO_USER=root
          - KILLBILL_DAO_PASSWORD=killbill
          - KILLBILL_CATALOG_URI=SpyCarAdvanced.xml
          - KB_org_killbill_server_test_mode=true
      kaui:
        image: killbill/kaui:3.0.9
        ports:
          - "9090:8080"
        environment:
          - KAUI_CONFIG_DAO_URL=jdbc:mysql://db:3306/kaui
          - KAUI_CONFIG_DAO_USER=root
          - KAUI_CONFIG_DAO_PASSWORD=killbill
          - KAUI_KILLBILL_URL=http://killbill:8080
      db:
        image: killbill/mariadb:0.24
        volumes:
          - type: volume
            source: db
            target: /var/lib/mysql
        expose:
          - "3306"
        environment:
          - MYSQL_ROOT_PASSWORD=killbill
  2. Create a new remote debug configuration in Eclipse/Intellij with Host=localhost and Port=12345.

  3. Add a breakpoint to the code.

  4. Execute any action that invokes the code with the breakpoint. This will run the code in debug mode.

Alternatively, you may also start Kill Bill in Docker using the following command at step 1:

docker run -p 12345:12345 -p 8080:8080 -e 'JVM_JDWP_PORT=*:12345' killbill/killbill:0.24.1

Other Notes

  • To debug instances remotely (e.g. in the cloud), you need to specify the publicly accessible IP address of your instance.

  • Make sure also that the port 12345 is open (e.g. check your security groups in AWS)

Turning on DEBUG logging

Specify:

-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=8000
-Dcom.sun.management.jmxremote.rmi.port=8000
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=localhost

on startup in your JVM settings to enable access via jconsole.

You can set at runtime the log level of specific classes and packages (look for the ch.qos.logback.classic:Name=default,Type=ch.qos.logback.classic.jmx.JMXConfigurator MBean).

Similar considerations as the remote debugging section apply regarding the IP address and expose ports (see above). For Docker, java.rmi.server.hostname should point to the host IP address (e.g. 192.168.99.100). Passing this IP address to our Docker images as ENV_HOST_IP enables jConsole automatically.

Following tokens

To understand why a specific action was taken, it is often useful to follow tokens.

When invoking a JAX-RS (HTTP) API, a userToken is automatically generated. You can also specify your own token via the X-Request-Id HTTP header (it must be a UUID).

Once set, this token is:

  • attached to the processing thread

  • attached to SLF4J MDC context as kb.userToken and displayed by default in logs (tok=)

  • passed to all APIs as CallContext#getUserToken()

  • passed to internal bus events as BusEvent#getUserToken()

  • passed to external bus events as ExtBusEvent#getUserToken()

  • passed to future notifications as userToken

Note that when a notification is created, a futureUserToken is auto-generated. That futureUserToken becomes the userToken when the notification is dispatched.

You can search for these tokens in Kaui, on the Queues screen, accessible from the tenant page or via the per-account sub-menu (you need to be logged-in as the Kaui root user).

Testing changes over time

We have a clock abstraction in Kill Bill that can be manipulated through an API, as long as you start Kill Bill with org.killbill.server.test.mode=true (-e KILLBILL_SERVER_TEST_MODE=true in Docker):

curl -v \
     -u admin:password \
     -H "X-Killbill-ApiKey: bob" \
     -H 'X-Killbill-ApiSecret: lazar' \
     -H 'X-Killbill-CreatedBy: admin' \
     "http://127.0.0.1:8080/1.0/kb/test/clock"

curl -v \
     -u admin:password \
     -H "X-Killbill-ApiKey: bob" \
     -H 'X-Killbill-ApiSecret: lazar' \
     -H 'X-Killbill-CreatedBy: admin' \
     -H "Accept: application/json" \
     -X POST \
     "http://127.0.0.1:8080/1.0/kb/test/clock?requestedDate=2015-12-14T23:02:15.000Z"

curl -v \
     -u admin:password \
     -H "X-Killbill-ApiKey: bob" \
     -H 'X-Killbill-ApiSecret: lazar' \
     -H 'X-Killbill-CreatedBy: admin' \
     -H "Accept: application/json" \
     -X PUT \
     "http://127.0.0.1:8080/1.0/kb/test/clock?days=10"

Here is an example how it could be used in tests.

Docker and Ansible

Our Docker images use Ansible behind the scenes to install and configure Kill Bill in the container. The actual Ansible commands run are available in the images via the environment variables $KPM_INSTALL_CMD and $START_TOMCAT_CMD.

If you need to debug this setup process, you can start a container via docker run -ti killbill/killbill:0.22.0 bash and execute these commands manually.

The Ansible command to start Tomcat can also be modified via the $START_TOMCAT_OPTS variable. One useful setting is to create the container using START_TOMCAT_OPTS='--skip-tags tomcat_cleanup' (e.g. docker run -e START_TOMCAT_OPTS='--skip-tags tomcat_cleanup' -ti killbill/killbill:0.22.0). While by default Ansible will cleanup the Tomcat directories (such as webapps/ROOT), this option will skip this task. This is especially useful when developping on Kill Bill: you can replace specific jars in the webapps/ROOT directory and simply restart the container to test your changes.

Performance and profiling

Kill Bill has some built-in support to profile incoming calls. The mechanism is based on HTTP headers that need to be set to specify what needs to be profiled, and as a result, a new HTTP header is returned.

Types of profiling

In order to return profiling information, one needs to add the HTTP header X-Killbill-Profiling-Req and specify which calls should be instrumented. Today, we support:

  • JAXRS: The HTTP call will be returned with the time spent in the call.

  • API: Any Kill Bill API call will be returned with the time spent in the call

  • DAO: Any Kill Bill DAO (request to database) call will be returned with the time spent in the call

  • DAO_DETAILS: Any Kill Bill DAO (request to database) call along with specifics about time time spent for audit/history and caching will be returned with the time spent in the call

  • DAO_CONNECTION: The time spent trying to acquire a database connection

  • GLOCK: The time spent trying to acquire the global (Account) lock

  • PLUGIN: Any Kill Bill call spent when calling a plugin will be returned with the time spent in the call

JAXRS Profiling

The time is extracted from a Jersey filter, so it should be very close to the time spent into the JAX-RS application handler, not taking into account the time spent in the container itself (i.e. Jetty, Tomcat, …​).

API Profiling

We are relying on AOP to profile each of our API calls. We are injecting a Guice AOP module to profile each of our API calls. The time returned does not include the extra time spent to dispatch the request using the AOP handler.

In addition, there is another AOP module that is used to validate the user has the right set of permissions to execute this call. Since this occurs prior we do the profiling this time is currently not included in the profiling of the API call.

Finally when making API calls from plugins, there may be also a small overhead to go through OSGI layer and this time is also not included.

DOA Profiling

Most of the requests issued from Kill Bill Dao (e.g DefaultPaymentDao) classes follow the same mechanism:

  1. A transaction is started by calling the execute method of the EntitySqlDaoTransactionalJdbiWrapper. Here, we create a Handle, essentially retrieving a database connection. The time taken to get the connection can be profiled by using the DAO_CONNECTION mask.

  2. Then, a jdbi transaction is started using cglib, and this transaction is actually run through a series of handlers, such as our RestartTransactionRunner. This extra time, is not being measured by our profiling layer.

  3. Finally, for each SQL operation within the transaction, we also proxy the calls using the java Proxy mechanism, and so all those calls are funneled trough the associated EntitySqlDaoWrapperInvocationHandler, and this is where the timing associated to the DAO (and DAO_DETAILS) mask is taken.

Note that the profiling time reported by the API mask — since it supersedes all dao calls — will include everything described above. Therefore it is possible to see an API that would only make a dao call take slightly longer than the dao call itself.

The DAO mask will take the total time it takes to run the invocation handler invoke method. Because there are several paths, we also include the option to return DAO_DETAILS:

  • In most cases, we will have a profiling entry DAO_DETAILS:<method> (raw) that only profiles the call requested.

  • In the case of an insert, update, deletion, the code will also update audit logs, and history table. In this case, we will also see a profiling entry DAO_DETAILS:<method> (history/audit) that will return the time it took to create the additional records in these tables.

  • In the case of a cacheable query (e.g return a record_id associated to an object uuid), we may return the result from the cache and so this is the only case where DAO_DETAILS:<method>:raw will not be printed.

  • In all other cases, the DAO_DETAILS:<method> (raw) should match the DAO:<method> time.

GLOCK Profiling

The GLOCK mask will simply output the time it took to grab the database lock associated with a given Account — the only kind of global lock currently supported in Kill Bill.

PLUGIN Profiling

This mask can be used to profile the time it takes for Kill Bill to make calls to a given plugin using a given plugin api. Here again, such plugin calls are being proxied using the Proxy mechanism, and the timing is taken from the invocation handler.

The information will contain the name of the plugin api class and the method being invoked.

Example

Kill Bill will return a json object containing the hierarchy for the calls along with the time spent in uSec (1000 nanoseconds or 0.000001 seconds).

Example (Note the piping to extract the response header, and format the json nicely):

curl -v \
     -u admin:password \
     -H "Accept: application/json" \
     -H "X-Killbill-ApiKey: bob" \
     -H "X-Killbill-ApiSecret: lazar" \
     -H 'X-Killbill-Profiling-Req: JAXRS,API,DAO,DAO_DETAILS,DAO_CONNECTION,GLOCK' \
     "http://127.0.0.1:8080/1.0/kb/paymentMethods/3dd4e9d3-2be3-4bf2-b1e4-64a1785afd53" 2>&1 \
     | grep 'X-Killbill-Profiling-Resp' \
     | awk '{$1=$2=""; print $0}' \
     | jq '.'

Response:

{
  "rawData": [
    {
      "name": "JAXRS:1.0/kb/paymentMethods/3dd4e9d3-2be3-4bf2-b1e4-64a1785afd53",
      "durationUsec": 7654,
      "calls": [
        {
          "name": "API:getPaymentMethodById",
          "durationUsec": 4158,
          "calls": [
            {
              "name": "DAO_CONNECTION:get",
              "durationUsec": 60
            },
            {
              "name": "DAO:PaymentMethodSqlDao: getById",
              "durationUsec": 1371,
              "calls": [
                {
                  "name": "DAO_DETAILS:PaymentMethodSqlDao (raw): getById",
                  "durationUsec": 1364
                }
              ]
            }
          ]
        },
        {
          "name": "API:getAccountById",
          "durationUsec": 2593,
          "calls": [
            {
              "name": "API:getAccountById",
              "durationUsec": 2462,
              "calls": [
                {
                  "name": "API:getAccountByRecordIdInternal",
                  "durationUsec": 2408,
                  "calls": [
                    {
                      "name": "DAO_CONNECTION:get",
                      "durationUsec": 40
                    },
                    {
                      "name": "DAO:AccountSqlDao: getByRecordId",
                      "durationUsec": 877,
                      "calls": [
                        {
                          "name": "DAO_DETAILS:AccountSqlDao (raw): getByRecordId",
                          "durationUsec": 872
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        },
        {
          "name": "API:getAccountAuditLogs",
          "durationUsec": 6
        }
      ]
    }
  ]
}

We can see that for each mask there is some profiling data and such data retains the struture of the calls being made.

Client Side

The Ruby client library (https://github.com/killbill/killbill-client-ruby) has been extended to support profiling data. By passing a additional options the library will send the correct header to Kill Bill and capture the resulting information.

Currently, only JAXRS profilingData will be requested and returned for simplification.

In the example below per_thread_profiling_data will contain the timing for the various calls, which in that scenario would be post:/1.0/kb/payments/uuid and get:/1.0/kb/payments/uuid:. The provided hash will contain one key per call and the values will be an array of timing data expressed in uSec:

# Add the :profilingData option
per_thread_profiling_data = {}
options = {:username => 'admin', :password => 'password', :profilingData => per_thread_profiling_data}
res = auth_capture_task.op_create_capture(payment_id, ext_key, amount, currency, username, options)

Note that Ruby client gem will not make any attempt to synchronize access the the profiling_data map, when multiple threads are making calls. It is recommended to use per-thread data and potentially merge the results from the different threads at the end.

Tips and tricks

  • In most cases, the client will be a bottleneck. Make sure to watch the number of sockets in TIME_WAIT (watch 'netstat -an | grep 8080 | grep WAIT | wc -l') and speed up the recycling process:

echo 2 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 15000 65000 > /proc/sys/net/ipv4/ip_local_port_range
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
  • On the server side, check the expected concurrency level by watching the number of sockets in ESTABLISHED (watch 'netstat -an | grep 8080 | grep EST | wc -l')

  • Make sure to allocate enough database (org.killbill.dao.maxActive / org.killbill.billing.osgi.dao.maxActive) and plugin (org.killbill.payment.plugin.threads.nb) threads. Check your container thread pool too (e.g. conf/server.xml for Tomcat)

  • Use tools like Siege (http://www.joedog.org/siege-home/) to verify your basic setup: siege -b -t30S -c100 http://127.0.0.1:8080/1.0/kb/test/clock should yield at least 5k req./s.

  • If Shiro is spending too many CPU cycles for authentication, lower the default number of iterations (e.g. org.killbill.security.shiroNbHashIterations=2000).

  • When using YourKit, turn off probes (especially the database ones). They cause a significant slowdown.

  • Allow a warm-up period, before starting a full test, to avoid contention in the JRuby JIT.

  • Use mjprof to extract stacktraces:

java -jar target/mjprof-1.0.jar jmx/127.0.0.1:8989/.ncontains/name,RMI TCP/.ncontains/name,RMI Reaper/.ncontains/name,RMI RenewClean/.ncontains/name,RMI Scheduler/.ncontains/name,jruby-restarter/.ncontains/name,com.google.inject.internal.util.$Finalizer/.ncontains/name,Finalizer/.ncontains/name,Reference/.ncontains/name,FelixStartLevel/.ncontains/name,FelixDispatchQueue/.ncontains/name,http-nio-8080/.ncontains/name,Abandoned connection cleanup thread/.ncontains/name,CM Event Dispatcher/.ncontains/name,CM Configuration Updater/.ncontains/name,SCR Component Actor/.ncontains/name,Timer-/.ncontains/name,telnetconsole.Listener/.ncontains/name,O worker /.ncontains/name,O boss /.ncontains/name,NioBlockingSelector/.ncontains/name,Signal Dispatcher/.ncontains/name,main/.ncontains/name,JMX server connection timeout/.ncontains/state,WAITING/.sort/state/

Load Tests

Integration Tests

The integration tests repo allows to run load tests against a running instance of Kill Bill. The README has a section explaining how they work.

Note: It is difficult to rely on single process MT threaded Ruby script to generate enough load.

GNU parallel: simple load testing script

Another way to generate load is to rely on GNU parallel:

seq 0 500 | parallel -j10 --no-notice -u ./perf_test.sh

Where perf_test.sh is:

iteration_id=$1

echo "===> Starting $iteration_id"

account_id=$(curl -v \
                  -X POST \
                  -u admin:password \
                  -H 'Content-Type: application/json' \
                  -H 'X-Killbill-ApiKey: bob' \
                  -H 'X-Killbill-ApiSecret: lazar' \
                  -H 'X-Killbill-CreatedBy: creator' \
                  --data-binary "{\"name\":\"john\",\"email\":\"profiling@example.com\",\"externalKey\":\"perf-$RANDOM-$RANDOM-$RANDOM\",\"currency\":\"USD\"}" \
                  http://127.0.0.1:8080/1.0/kb/accounts 2>&1 | grep Location | awk '{print $3}' | awk -F'/' '{print $7}')

account_id=$(tr -dc '[[:print:]]' <<< "$account_id")

curl  \
     -X POST \
     -u admin:password \
     -H 'Content-Type: application/json' \
     -H 'X-Killbill-ApiKey: bob' \
     -H 'X-Killbill-ApiSecret: lazar' \
     -H 'X-Killbill-CreatedBy: creator' \
     --data-binary '{"pluginName":"YOUR-PLUGIN","pluginInfo":{"properties":[{"key":"type","value":"CreditCard","isUpdatable":false},{"key":"ccType","value":"visa","isUpdatable":false},{"key":"ccName","value":"A Smith","isUpdatable":false},{"key":"email","value":"foo@bar.com","isUpdatable":false},{"key":"ccExpirationMonth","value":"03","isUpdatable":false},{"key":"ccExpirationYear","value":"2016","isUpdatable":false},{"key":"ccVerificationValue","value":"222","isUpdatable":false},{"key":"address1","value":"lskdjf","isUpdatable":false},{"key":"address2","value":"","isUpdatable":false},{"key":"city","value":"sdfsdfsff","isUpdatable":false},{"key":"ccFirstName","value":"sdfsdf","isUpdatable":false},{"key":"ccLastName","value":"fdsfdsf","isUpdatable":false},{"key":"zip","value":"23812","isUpdatable":false},{"key":"country","value":"USA","isUpdatable":false},{"key":"state","value":"CA","isUpdatable":false},{"key":"ccNumber","value":"4111111111111111","isUpdatable":false}]}}' \
     "http://127.0.0.1:8080/1.0/kb/accounts/${account_id}/paymentMethods?isDefault=true" > /dev/null 2>&1


curl  \
     -X POST \
     -u admin:password \
     -H 'Content-Type: application/json' \
     -H 'X-Killbill-ApiKey: bob' \
     -H 'X-Killbill-ApiSecret: lazar' \
     -H 'X-Killbill-CreatedBy: creator' \
     --data-binary '{"transactionType":"AUTHORIZE","amount":"10","currency":"USD"}' \
     "http://127.0.0.1:8080/1.0/kb/accounts/${account_id}/payments" > /dev/null 2>&1

echo "===> Finished $iteration_id"

Note that on a single machine, this client-side script will most likely be the bottleneck (spawning the processes takes too much time). Use JMeter instead.

Seeking help

If all else fail, reach out to our mailing-list for help (do not open a GitHub issue).

In your message, specify:

  • What you are seeing and what you are expecting

  • How to reproduce your scenario (cURL commands, code snippet, …​)

  • kpm diagnostic output (see KPM Diagnostic Usage)