Raw data from a log.
When I started working at Revo, they had a system that worked for them: it pulled data off their hardware, played it back (slowly), and reported microphone levels for a system over a period of time. It served well as a data-gathering tool for their existing scope, and for looking at one system at a time. Accessing the scan data later, or accessing support files, however, mostly meant manually parsing log files, which limited the impact the data could have. Because the data was generated by dumping out a live log system, there were also data-integrity issues, and no reliable common values tying different files together for more complex analysis.
To solve this, I delivered custom tools to a client that showed them the data they wanted to see for their systems. Using pieces of our new data collection, parsing, and visualization system, I could quickly stitch together and then customize a tool to surface any data in the system and get it into the hands of our users.
The first step was to parse the files coming from the scan tool, along with the support files (pictured below), into something more easily used in real time. After running scans overnight, or a series of scans on different machines, it was easy to end up with a handful of files that needed analysis. The first tool I created pulled the important information out of these files into an easily parsed CSV file.
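The extraction step looked roughly like the sketch below. The log format shown here is hypothetical (the real scan and support file formats aren't reproduced in this post), but the shape of the tool is the same: scan raw log text for the lines that matter, pull out the fields with a regular expression, and write them to a CSV that any later tool can read.

```python
import csv
import re

# Hypothetical log line, e.g. "12:00:01 MIC 3 level=-42.5 err=0"
LOG_LINE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+MIC\s+(?P<mic>\d+)\s+"
    r"level=(?P<level>-?\d+(?:\.\d+)?)\s+err=(?P<err>\d+)"
)

def parse_log(text):
    """Pull the matching lines out of a raw log dump into dict rows."""
    rows = []
    for line in text.splitlines():
        m = LOG_LINE.search(line)
        if m:
            rows.append(m.groupdict())
    return rows

def write_csv(rows, path):
    """Write the extracted rows into an easily parsed CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["time", "mic", "level", "err"])
        writer.writeheader()
        writer.writerows(rows)
```

Lines that don't match the pattern (status chatter, partial writes from the live log dump) are simply skipped, which sidesteps most of the integrity problems in the raw files.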
sample support file
sample scan file
scan zips from a test
one scan’s parsedData folder
UI for parsing and filtering scan files to find pertinent data
Using PyQt I put together a simple UI to make this process more accessible. It also gave users the chance to look at all of the files they needed to analyze at once. Working with the team to decide which factors mattered most, I expanded the filters to what you see above, making it straightforward to understand system status at a glance.
There are 204 data points for each system’s configuration in the bottom window, all filterable, making it easy to find only the systems that matched what we were looking for. In short order the data had become vastly more accessible and actionable. We got the tool into people’s hands and worked out the next sprint to keep the project agile.
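Underneath the PyQt table, the filtering itself is simple. A minimal sketch of that logic, assuming each system's configuration is a dict of fields (the field names here are invented for illustration):

```python
def filter_systems(rows, **criteria):
    """Keep only rows matching every criterion.

    A criterion value may be a literal (exact match) or a
    callable predicate, e.g. errs=lambda e: e > 2.
    """
    def matches(row):
        for key, want in criteria.items():
            value = row.get(key)
            if callable(want):
                if not want(value):
                    return False
            elif value != want:
                return False
        return True

    return [row for row in rows if matches(row)]
```

Stacking literal and predicate criteria this way is what lets a user narrow a few hundred fields per system down to only the machines that match what they are hunting for.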
The final step here was to visualize the information. We already had the mic’s audio level, but by combining it with the mic and band errors we were able to give our testers and developers a much better understanding of what the systems were doing and the cause-and-effect relationships at play.
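Before anything can be charted, the two streams have to be joined: level samples and error events come from different files, linked only by timestamps. A sketch of that alignment step, under the assumption that levels are (time, dB) samples and errors are event timestamps (both hypothetical shapes, not the actual file schema):

```python
from bisect import bisect_left

def annotate_levels_with_errors(levels, errors):
    """Attach an error count to each level sample.

    levels: list of (t, db) samples, sorted by time t.
    errors: list of error event times.
    Each sample gets the number of errors that occurred
    between it and the next sample.
    """
    errs = sorted(errors)
    times = [t for t, _ in levels]
    out = []
    for i, (t, db) in enumerate(levels):
        # errors falling in the window [t, next sample)
        hi = times[i + 1] if i + 1 < len(times) else float("inf")
        n = bisect_left(errs, hi) - bisect_left(errs, t)
        out.append((t, db, n))
    return out
```

Once the series are aligned like this, plotting level and error count on a shared time axis makes the cause-and-effect relationships visible at a glance.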
After our initial success and the benefit we saw from better using the data, we started looking for more opportunities. While I worked to adapt the system to read bug-report data, another team member dramatically improved and expanded the data collection tool. Shortly thereafter, we were tasked with diagnosing a bug in our hardware using better data collection, parsing, and visualization.
With this approach we were able not only to diagnose that issue, but to find other contributing issues, which we documented, charted, and passed on to the team to correct. Revolabs now has a much more data-driven approach that will carry forward into future projects at the company.
Adaptations to system for support files with battery focus for specific client use
My final accomplishment at Revo was helping to set up quality data sources, formats, access points, and outputs for data coming from Revolabs’ latest piece of hardware. With everything we learned upgrading data systems midstream, we wanted the next piece of hardware to have a firm foundation of data collection, availability, analytical tooling, and visualization from the beginning.
It was a rewarding experience to go from being asked to fix a data problem, to seeing the people around me embrace the benefits of my solution for the data problem.
Unlike a lot of my other work, this was very utilitarian: building quick solutions and layering from there to produce the larger system. My initial Python scripts simply cleaned up the data, getting it into organized CSV files so at least the manual parsing could be done in one place instead of, for example, across eight different mic files.
When Qt was added to give the system a straightforward interface, making it easier to access files on the network and to display the data in filterable tables, we could see even more possibilities. Charting was handled by Bokeh at first, which worked well for our internal work, but we eventually moved to Matplotlib in order to build a stable .exe to send out.