Read this guide if
This guide assumes that you:
Piwik is an application that does mainly two things:
To achieve that result, several parts of Piwik come into play:
Piwik's codebase is composed of:
Plugins are not just targeted at 3rd party developers who want to customize Piwik: most of Piwik is implemented through plugins. Piwik Core is meant to be as small as possible.
As a result, there are two kinds of plugins:
plugins/ folder) or through Piwik's Marketplace in the web interface
Here are the main files and folders composing Piwik's codebase:
Piwik uses Composer to install its dependencies (PHP libraries) into the
The entry point for the web application is index.php in the root. This file initializes everything and calls the
The front controller will route an incoming HTTP request to a plugin controller based on URL parameters:
In this example, the front controller will call the action index on the controller of the
Plugin controllers return a view (usually HTML content) which is sent in the HTTP response.
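The routing step can be illustrated with a small sketch. It is written in Python purely for illustration and is not Piwik's PHP code. It assumes the plugin and the action are passed in URL query parameters named module and action, and that a missing action falls back to index. The example URLs and the CoreHome plugin name are only sample values.

```python
from urllib.parse import urlparse, parse_qs

DEFAULT_ACTION = "index"  # assumption: action used when the URL names none


def resolve_route(url):
    """Return the (plugin, action) pair a front controller would dispatch to."""
    query = parse_qs(urlparse(url).query)
    # Assumption: the plugin is named by the "module" query parameter.
    plugin = query.get("module", [None])[0]
    if plugin is None:
        raise ValueError("no plugin named in the request")
    # Assumption: the controller action is named by the "action" parameter.
    action = query.get("action", [DEFAULT_ACTION])[0]
    return plugin, action


print(resolve_route("http://example.org/index.php?module=CoreHome&action=index"))
print(resolve_route("http://example.org/index.php?module=API"))
```

The second call shows the fallback: no action is given, so the sketch resolves to the index action of the named plugin.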
A part of Piwik's long-term roadmap is to move more and more parts of Piwik's UI to AngularJS.
Read more about this in the "Working with Piwik's UI" guide.
The HTTP reporting API works similarly to the web application. Its role is to serve reports in machine-readable formats (XML, JSON, …).
It has the same entry point and is also dispatched by the front controller.
This HTTP request will be processed like any other call to a controller: the plugin name is API and no action is given, which will fall back to
Piwik\Plugins\API\Controller class will be called, and it will dispatch the call to the targeted API, acting as a second front controller for API calls. In our example, SEO.getRank means that the Piwik\Plugins\SEO\API::getRank() method will be called.
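The second dispatch step, from a string such as SEO.getRank to an API class and method, could look roughly like the sketch below. It is an illustrative Python model, not the actual controller code, and the class name template is an assumption about how plugin API classes are namespaced.

```python
# Assumption: each plugin exposes its API class under this namespace pattern.
API_CLASS_TEMPLATE = "Piwik\\Plugins\\{plugin}\\API"


def resolve_api_method(method):
    """Split 'Plugin.method' into the API class name and the method to call."""
    plugin, separator, name = method.partition(".")
    if not (plugin and separator and name):
        raise ValueError(f"expected 'Plugin.method', got {method!r}")
    return API_CLASS_TEMPLATE.format(plugin=plugin), name


class_name, method_name = resolve_api_method("SEO.getRank")
print(f"{class_name}::{method_name}()")
```

A real dispatcher would also have to check that the class and method exist and that the caller is allowed to use them; the sketch only covers the name resolution.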
Its entry point is different from Piwik's web application and HTTP reporting API: it is through the
Read more about this in the "The Tracking HTTP API" reference.
Piwik offers a command line API through the ./console script. This script uses the Symfony Console component.
Plugins can expose CLI commands that can be invoked like this:
Command classes are located in plugins/*/Commands and are auto-detected by Piwik.
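How the auto-detection works internally is not described here. As a rough idea of what convention-based discovery can look like, the Python sketch below scans a directory tree for PHP files under plugins/*/Commands. The plugin and file names are hypothetical and exist only in a temporary directory created for the example.

```python
import tempfile
from pathlib import Path


def find_command_files(piwik_root):
    """List PHP files under plugins/*/Commands, sorted for stable output."""
    return sorted(Path(piwik_root).glob("plugins/*/Commands/*.php"))


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one command file and one unrelated plugin file.
    for relative in ("plugins/MyPlugin/Commands/HelloWorld.php",
                     "plugins/MyPlugin/API.php"):
        path = Path(root, relative)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = [p.relative_to(root).as_posix() for p in find_command_files(root)]

print(found)  # only the file inside a Commands folder is listed
```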
Read more about this in the "Piwik on the Command Line" guide.
Piwik lets you collect analytics data and later retrieve it as reports. Let's see what happens in between and how Piwik models, processes and stores data.
The HTTP tracking API (i.e. the Piwik\Tracker component) receives raw analytics data, which we call "Log data".
Log data is represented in PHP as Piwik\Tracker\Visit objects, and is stored into the following tables:
log_visit contains one entry per visit (returning visitor)
log_action contains all the types of actions possible on the website (e.g. unique URLs, page titles, download URLs…)
log_link_visit_action contains one entry per action of a visitor (page view, …)
log_conversion contains conversions (actions that match goals) that happen during a visit
log_conversion_item contains e-commerce conversion items
Those tables are designed and optimized for fast insertions, as the tracking API needs to be as fast as possible in order to handle websites with heavy traffic.
The content of those tables (and their related PHP entities) is explained in more detail in the "Piwik database schema" guide.
The tables above are not designed or optimized for extracting high-level reports: aggregating the log entries by day, week or month can become too resource-intensive when there is a lot of data.
The archiving process will read Log data and aggregate it to produce "Archive data". Data is aggregated and stored for each:
Archive data can be:
numeric metrics: simple numeric values (like the number of page views)
These are stored in the archive_numeric_* tables. Values are stored as floats.
table records: bidimensional data (can be numeric values as well as anything else), represented as
These are stored in the
DataTable objects are serialized to a string and compressed, then stored as a BLOB in the table.
DataTable objects stored in the database are called records, to differentiate them from the DataTable objects manipulated and returned by Piwik's API, which are called reports.
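The serialize-then-compress step for table records can be illustrated with a short Python sketch. It uses JSON and zlib as stand-ins; the actual serialization format and compression function used by Piwik are not specified here, and the row fields (label, nb_visits) are illustrative.

```python
import json
import zlib


def to_blob(rows):
    """Serialize table rows to a string, then compress the result to bytes."""
    text = json.dumps(rows, sort_keys=True)
    return zlib.compress(text.encode("utf-8"))


def from_blob(blob):
    """Reverse to_blob: decompress, then deserialize back into rows."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Illustrative record: one row per page, with a single metric.
record = [
    {"label": "/home", "nb_visits": 120},
    {"label": "/pricing", "nb_visits": 45},
]

blob = to_blob(record)
print(type(blob).__name__)        # the value that would go into the BLOB column
print(from_blob(blob) == record)  # the round trip restores the original rows
```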
Every numeric metric or table record is processed and stored at each aggregation level: day, week and month. For example, this means that the "Entry pages" report is processed and stored for every day of the month as well as for every week, month, year and custom date range. Such data is redundant, but that redundancy is essential to guarantee fast performance.
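To make the redundancy concrete, here is a minimal Python sketch of a single numeric metric (number of visits) aggregated at two levels. The visit dates are made up, and the real archiving process is considerably more involved; the point is only that the month total is computed once and stored alongside the day totals instead of being recomputed on every query.

```python
from collections import Counter
from datetime import date

# Made-up log data: one date per tracked visit.
log_visits = [
    date(2014, 10, 1),
    date(2014, 10, 1),
    date(2014, 10, 2),
    date(2014, 11, 3),
]


def aggregate_by_day(visit_dates):
    """Count visits per day, keyed by ISO date (YYYY-MM-DD)."""
    return Counter(day.isoformat() for day in visit_dates)


def aggregate_by_month(day_totals):
    """Sum the day totals into month totals, keyed by YYYY-MM."""
    month_totals = Counter()
    for day, count in day_totals.items():
        month_totals[day[:7]] += count
    return month_totals


day_totals = aggregate_by_day(log_visits)
month_totals = aggregate_by_month(day_totals)

print(day_totals["2014-10-01"])  # 2
print(month_totals["2014-10"])   # 3
```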
Because Archive data must be fast to query, it is split into separate tables per month. We will then have:
archive_numeric_2014_10: metrics for October 2014
archive_blob_2014_10: reports for October 2014
archive_numeric_2014_11: metrics for November 2014
archive_blob_2014_11: reports for November 2014
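The naming scheme above can be expressed as a small helper. This is an illustrative Python sketch and the function name is hypothetical; it simply derives the two table names from a date, following the pattern shown in the list.

```python
from datetime import date


def archive_tables(day):
    """Return the (numeric, blob) archive table names for the month of `day`."""
    suffix = f"{day.year}_{day.month:02d}"
    return f"archive_numeric_{suffix}", f"archive_blob_{suffix}"


print(archive_tables(date(2014, 10, 15)))
# ('archive_numeric_2014_10', 'archive_blob_2014_10')
```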
As shown above, data is stored either as numeric metrics or table records.
Reports are DataTable objects and are served by the API classes defined by plugins. API classes access persisted metrics or records and transform them into presentable reports.
Sometimes, one persisted record can be the source of several API reports.
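As a hedged illustration of one record feeding several reports, the Python sketch below derives two different views from the same stored rows. The record contents, the field names and both report functions are hypothetical and do not correspond to actual Piwik API methods.

```python
# Hypothetical persisted record: one row per page with two raw metrics.
record = [
    {"label": "/home", "nb_visits": 120, "bounces": 30},
    {"label": "/pricing", "nb_visits": 45, "bounces": 9},
    {"label": "/blog", "nb_visits": 80, "bounces": 40},
]


def most_visited(rows, limit):
    """Report 1: labels of the most visited pages, best first."""
    ranked = sorted(rows, key=lambda row: row["nb_visits"], reverse=True)
    return [row["label"] for row in ranked[:limit]]


def bounce_rates(rows):
    """Report 2: bounce rate per page, computed from the same rows."""
    return {row["label"]: row["bounces"] / row["nb_visits"] for row in rows}


print(most_visited(record, 2))  # ['/home', '/blog']
print(bounce_rates(record)["/blog"])  # 0.5
```

Both reports read the same persisted rows; only the transformation applied at query time differs.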
You can read more details on how reports are created and served in the "Reports" guide.
Piwik Core only defines the main processes and behaviors. Plugins can extend and customize them through several extensibility points:
You can read more about this topic in the "Piwik's Extensibility Points" guide.