This is Part 1 in a series of articles exploring real-time and historical analytics for Node.js. It covers the analytics options currently available in the Node.js ecosystem. I think there's a need for better analytics in the Node.js ecosystem to support engineers and business folks. I'm researching and writing to stimulate feedback on this, and to explore what, if anything, needs to be developed in terms of open source modules, APIs, etc.
Let's go through what's available currently to implement analytics in Node.js.
If you're developing with Node.js, chances are you're doing some kind of persistent logging. You might be logging critical events to make troubleshooting and development testing easier, logging outcomes of test suites like Vows, or keeping logs of requests and responses to understand how your server(s) are being used. You might be using a logging module like Nodejitsu's Winston and/or PaaS offerings designed specifically for logging like Splunk or Loggly, or may have a customized approach to logging. And hosting / deployment platforms like Heroku, Joyent, Nodejitsu, etc. often provide additional logging capabilities.
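To make the request/response logging case concrete, here's a minimal sketch that uses only Node's built-in http module. It writes one JSON line per completed response. The function names (buildLogEntry, withRequestLog, createLoggedServer) and the log field names are my own placeholders for illustration, not part of Winston, Loggly, or any other tool mentioned above.

```javascript
const http = require('http');

// Build one log record for a finished request.
// The field names here are illustrative assumptions, not a standard format.
function buildLogEntry(req, res, durationMs) {
  return {
    time: new Date().toISOString(),
    method: req.method,
    url: req.url,
    status: res.statusCode,
    durationMs: durationMs,
    userAgent: req.headers['user-agent'] || null
  };
}

// Wrap a request handler so that every completed response is written
// out as a single JSON line via the supplied `write` function.
function withRequestLog(handler, write) {
  return function (req, res) {
    const start = Date.now();
    // 'finish' fires once the response has been handed off to the OS.
    res.on('finish', function () {
      const entry = buildLogEntry(req, res, Date.now() - start);
      write(JSON.stringify(entry) + '\n');
    });
    handler(req, res);
  };
}

// Create (but do not start) a server whose requests are logged to stdout.
function createLoggedServer(handler) {
  const write = function (line) { process.stdout.write(line); };
  return http.createServer(withRequestLog(handler, write));
}

// Example usage (commented out so the sketch has no side effects):
// createLoggedServer(function (req, res) {
//   res.end('hello\n');
// }).listen(3000);
```

One JSON object per line keeps the log easy to grep during development and easy to parse later. Swapping the `write` function is enough to send the same records to a file or to a logging service client.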
How logs are retrieved and analyzed is dictated by how the logs are written. Developers analyze filesystem-based logs in an ad hoc way during development using grep or other simple but powerful tools. Hosting environments may come with analysis tools, like Heroku's revamped logging and Joyent's Cloud Analytics. Cloud logging services provide APIs that are specialized for log retrieval and analysis, like Loggly's Retrieve API. Analysis tools for log data primarily focus on serving engineering needs, like troubleshooting, performance tuning, provisioning, testing, etc. with two basic functions:
The value of logs isn't limited to engineering - there's a long history of log data being used more broadly to make marketing and other business decisions - e.g. Apache logs providing an early foundation for what would become the multi-billion dollar web analytics industry. While this industry has evolved to rely more on client-side analytics and less on server-side data like logfiles, there's an interesting case to be made for the importance of server-side approaches in Node.js applications, supported most obviously by widespread use of Node.js in API servers (e.g. serving JSON to other servers via REST) and other situations where "clients" don't support client-side analytics.
As far as I know, the analytics (software that helps engineers and business people understand what's happening with their product) available for Node.js is limited to the two categories above. If there are other options out there, I'd love to hear about them.
A new category that makes sense to me is software that's designed specifically for the Node.js server with the following characteristics:
I've made a small amount of progress toward this by taking a stab at the instrumentation of a Node.js request object for analytics.
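To give a rough idea of what instrumenting a request object could look like, here's a hedged sketch. It is not the implementation referred to above. The `analytics` property, the `record` method, and the in-memory `counters` map are all names I've invented for illustration.

```javascript
// Hypothetical in-memory event counters. A real system would persist
// these or stream them somewhere instead of keeping them in process memory.
const counters = new Map();

// Increment the running count for a named event.
function track(name) {
  counters.set(name, (counters.get(name) || 0) + 1);
}

// Attach an `analytics` property (an invented name) to a request object
// so that handlers can record events against the request that caused them.
function instrument(req) {
  req.analytics = {
    receivedAt: Date.now(),
    events: [],
    record: function (name, data) {
      this.events.push({ name: name, data: data || {}, at: Date.now() });
      track(name);
    }
  };
  return req;
}

// Example usage inside a request handler:
// instrument(req);
// req.analytics.record('signup', { plan: 'free' });
```

Keeping the events on the request itself means each one carries its server-side context (URL, headers, timing), which is the kind of data client-side analytics can't see when the "client" is another server.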
I'd love to hear feedback, corrections, etc. on my take on the state of analytics in the Node.js ecosystem and on this specific proposal for an addition to it.
Stay tuned for coverage of real-time analytics next week.