I was just wondering if anyone had any ideas or solutions to collecting anonymous usage metrics from PyQt UIs?
I’ve heard people mention it before in demonstrations but never really seen an implementation example.
Is there some kind of global signal on user interaction that I can pipe to a function that will cache the metrics data (and send it to a server at regular intervals)?
I can see it being quite useful in figuring out which tools are currently the most useful and phasing out any unnecessary/unused ones.
Also in streamlining UIs for easier use.
Essentially a function that will run whenever a UI element is directly used, figure out the object’s name, and add it to the cache. Every 10 minutes or so, it would submit the data to a server.
The server can then display a Google Analytics style usage graph etc…
We’ve got something like that. I think you pretty much described the solution here. Either cache the data or just connect directly if your network is fast enough. The SQL traffic is tiny, and usually people don’t do “button mashing” when using their tools.
We use this to find out popular tools and then we can investigate why some tools gain acceptance and why others don’t. We also track install base - e.g. we can check if a project uses that tool, etc.
I don’t know too much - one of the guys in my team implemented it and it works just nicely.
My coworker built something like that. Our main tool UI pings a SQL file on the network when any buttons are pressed. There is a slight delay on the user side, but it’s not bad. It’s pretty neat to see the usage data. Much more informative than asking people, “how much do you use this tool?”
Good to hear that other people are doing it with success.
Now to figure out how best to implement it.
That slight delay is something I was hoping to cut down on, along with avoiding accidentally DDoSing a server (we have a crazy number of employees, though not all of them would use the tools), hence the caching and uploading at intervals.
If you’re using a standard SQL server, the network load is not an issue – you’ll be doing at most a few updates per minute, and (for example) MS SQL Server is supposed to be able to handle more than 1,300 hits per second.
However, you do probably want to think twice before recording every button click in detail – I seem to recall that Jeff and Adam talked about this at last year’s TA roundtable, and they said their data was so granular that it was hard to use effectively. And you should probably pop the updates onto their own thread (or a local repeater proc on your machine), because network dropouts will cause random pauses if you just execute the SQL inline in your code.
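In rough outline, the own-thread idea might look like this (a sketch only; `start_sender` and `write_to_db` are made-up names – the point is just that slot code only enqueues, and the daemon worker eats any network stalls):

```python
import queue
import threading

# Sketch only: start_sender / write_to_db are illustrative names. UI
# code just puts events on the queue; the daemon worker does the slow
# database write, so a network dropout stalls the worker, never the UI.
def start_sender(write_to_db):
    """Start a background sender thread and return its queue; callers
    record an event with queue.put(event)."""
    events = queue.Queue()

    def worker():
        while True:
            write_to_db(events.get())  # blocks here, not in the UI
            events.task_done()

    threading.Thread(target=worker, daemon=True).start()
    return events
```

Returning the queue itself also lets a caller `join()` it at shutdown if they care about flushing the last few events.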
Good use for a decorator – just write once and slap it onto the event handlers you care about…
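Something like this, for instance (a sketch; `tracked` and `report_usage` are placeholder names for whatever actually records the event):

```python
import functools

# Sketch only: tracked / report_usage are placeholder names.
def tracked(report_usage):
    """Decorator factory: wraps a slot/handler so each call is reported
    under the function's name before running normally."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            report_usage(func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorate
```

Then it really is write-once: `@tracked(report_usage)` on each handler you care about, and renaming a function automatically renames its metric.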
Yeah, you’re right. It would probably be too granular. Probably best done at the function/slot level.
Plus it will let me have more consistent control over what is being reported back (i.e. if I change names around, I won’t have fragmented data).
And it would definitely have to be in its own thread. Our systems lock up badly enough when we have network hiccups, so it would just fail otherwise.
A simple decorator and a thread are the best way to do it.
Also, tracking tool exceptions via sys.excepthook (after un-mangling Maya’s, anyway…) has helped.
(For exceptions you can also send emails rather than dumping to sql)
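The hook chaining might look roughly like this (a sketch; `install_metrics_excepthook` and `report` are made-up names, and in Maya you’d additionally have to fight its own excepthook replacement as mentioned above):

```python
import sys
import traceback

# Sketch only: install_metrics_excepthook / report are made-up names.
# report stands in for whatever sends the traceback to a DB, a log
# file, or an email.
def install_metrics_excepthook(report):
    """Chain a reporting hook in front of the existing sys.excepthook,
    preserving the original behaviour."""
    previous = sys.excepthook

    def hook(exc_type, exc_value, exc_tb):
        report("".join(traceback.format_exception(exc_type, exc_value, exc_tb)))
        previous(exc_type, exc_value, exc_tb)  # keep normal printing

    sys.excepthook = hook
```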
I’ve not long implemented this at work. I have a module that I load PyQt from, with my own button class, dialog class, etc. in it, so I have one place to implement this. I’ve overridden the mouseClickEvent to fill a global with usage data. Then on file save I squirt this data to the server DB. I did do it per click at first, but decided to collect and push the data later to save network noise. I also have a decorator that I use at the function level, because when I started this I intended it to be for widgets, but then realized that most of the time my button calls are simple wrappers around our API.
@thirstydevil , I’m just curious about the squirt on save. If the tool/maya crashes, I assume it loses the data? But if the parent window is closed (Maya for example), does the tool still count that as a save etc?
For exception cases the global excepthook should work. However maya fatals would be lost, at least AFAICT - mayaPy doesn’t get notified of fatals, it just disappears.
Ah, well then I suppose writing to a log file with a timestamp per entry may be best. It’d be just ASCII, so it’s a light file, and I can prevent data duplication that way. Crashes will then cause negligible data loss.
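For what it’s worth, a minimal version of that might be (sketch only; `log_event` and `drain_log` are illustrative names):

```python
import time

# Sketch only: log_event / drain_log are illustrative names. One
# timestamped ASCII line per event, appended immediately, so a crash
# loses at most the event in flight.
def log_event(path, widget_name):
    """Append one tab-separated, timestamped record to the log."""
    with open(path, "a") as f:
        f.write("%s\t%s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), widget_name))

def drain_log(path):
    """Read the accumulated events and truncate the file; meant to be
    called by whatever uploads to the server, which also avoids
    submitting the same entries twice."""
    with open(path, "r+") as f:
        lines = f.readlines()
        f.seek(0)
        f.truncate()
    return [line.rstrip("\n").split("\t", 1) for line in lines]
```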
I’m going to jump on this as soon as production on other bits dies down a bit.
It’s very easy to offload this sort of work so it doesn’t cause a UI interruption, once you get it working.
Remember the more granular your data, the more difficult it is to work with.
Tracking how many times a UI is opened or how long something takes to start up, is very simple.
Tracking every button push or menu click, you will have so much information it won’t be of any use. You will need to invest far more in the frontend than writing the data collection.
Start with something simple that will get you value and scratch the itch you need, see if it’s useful, go from there.
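As an illustration of the simple end of that spectrum, timing a tool launch is a one-liner wrapper (names are made up; `record_metric` stands in for whatever actually stores the number):

```python
import time

# Sketch of the "start simple" end: record one number per launch.
# record_metric is a stand-in for whatever stores or sends the value.
def timed_launch(record_metric, launch):
    """Run launch(), report its wall-clock duration in seconds, and
    return its result unchanged."""
    start = time.perf_counter()
    result = launch()
    record_metric("launch_seconds", time.perf_counter() - start)
    return result
```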
And don’t sweat over the loss of data to a crash, etc – this is statistical data about usage, losing a few button clicks here or there to crashes won’t affect your overall usage patterns. Crash resistance is much more important for collecting exception / error handling data – but that’s a different problem that should be handled with an except hook and some combination of log files, email and or database stuff.
[QUOTE=Rob Galanakis;18485]It’s very easy to offload this sort of work so it doesn’t cause a UI interruption, once you get it working.
Remember the more granular your data, the more difficult it is to work with.
Tracking how many times a UI is opened or how long something takes to start up, is very simple.
Tracking every button push or menu click, you will have so much information it won’t be of any use. You will need to invest far more in the frontend than writing the data collection.
Start with something simple that will get you value and scratch the itch you need, see if it’s useful, go from there.[/QUOTE]
Oh yeah, definitely not going as granular as I first let on. The idea of recording each one was just for simplicity of capturing input and letting the server filter. But you guys are right: having a decorator lets me standardize more on the script side and capture more useful broad usage. I.e. I never really intended to display every button press, but I hadn’t thought of a decorator to do it at that point.
[QUOTE=Theodox;18487]What he said.
And don’t sweat over the loss of data to a crash, etc – this is statistical data about usage, losing a few button clicks here or there to crashes won’t affect your overall usage patterns. Crash resistance is much more important for collecting exception / error handling data – but that’s a different problem that should be handled with an except hook and some combination of log files, email and or database stuff.[/QUOTE]
You’re right, the information isn’t actually that important/sensitive, so crash resistance isn’t imperative. I will eventually need to figure out a way to get crash reporting working in a similar manner, though.