Wikipedia:Reference desk/Archives/Computing/2018 February 7

Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


February 7

Getting statistics from Apache Subversion

Is there a way to find out what programmers actually do? I'd like to see who doesn't submit much anyway, who's submitting but whose code gets changed soon afterwards by someone else, etc. There's not much I can find. Joepnl (talk) 01:24, 7 February 2018 (UTC)

The usual approach goes a bit further than this. svn will tell you who did what, but it won't tell you why. So it's common practice to link this to your issue tracking system, by embedding the ID for the issue tracker into the svn commit comment as "[1234] Fixed the exploding foobar problem". You can then use svn log between two build tags to get a changelog between builds. Use some scripting to process all that, link it to your issue tracker, and you can slice and dice it every which way. Some people do this with external scripts, others push it into a database (sometimes permanently, though doing it temporarily is better) and then query across both the issue tracker and changelog databases.
You'll also need a commit script (in svn, a pre-commit hook) which checks that the coders have given a validly formatted issue ID on each commit, or else they get an electric shock and no food pellet. Andy Dingley (talk) 03:04, 7 February 2018 (UTC)
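A minimal sketch of the commit-message check described above, assuming the "[1234] Fixed the exploding foobar problem" convention with the issue ID at the very start of the message. The function names are hypothetical, and the electric shock is left as an exercise:

```python
import re

# Assumed convention: the message starts with "[<digits>]" followed by some text.
ISSUE_PREFIX = re.compile(r"^\[(\d+)\]\s+\S")


def issue_id_from_message(message):
    """Return the leading issue ID as an int, or None if the message is malformed."""
    match = ISSUE_PREFIX.match(message)
    return int(match.group(1)) if match else None


def check_commit_message(message):
    """Return 0 to accept the commit and 1 to reject it, like a hook's exit status."""
    return 0 if issue_id_from_message(message) is not None else 1
```

In an actual pre-commit hook the message would probably come from something like `svnlook log -t TXN REPOS`, and a non-zero exit status from the hook rejects the commit. Checking that the ID exists in the issue tracker would be a further step not shown here.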
I wish I'd thought of the electric shock before! I don't really care why someone's code got reverted or changed afterwards. One change may or may not have been their fault. After 1000 submits in a year, of which 800 were reverted or changed within a month, I think that's a nice metric to separate the men from the boys. I'm looking for a tool that can show just that. There's certainly more to that than a simple script. A diff between two files isn't easy. A diff over 50 versions showing whose code was actually changed is something that needs a tool, one that must exist but that I haven't found. Joepnl (talk) 04:15, 7 February 2018 (UTC)
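As a rough illustration of that metric, here is a sketch that works from the output of `svn log --xml -v`. It is only a file-level proxy: a commit counts as "reworked" if a different author touches one of the same paths within 30 days, which is much coarser than tracking whose lines were actually changed. The function names and the 30-day default are assumptions for the example, and it assumes commit dates increase with revision number:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime, timedelta


def parse_log(xml_text):
    """Return a list of (revision, author, date, paths), sorted by revision."""
    entries = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        author = entry.findtext("author", default="(unknown)")
        # svn dates look like 2018-02-07T01:24:00.000000Z; keep whole seconds only.
        date = datetime.strptime(entry.findtext("date")[:19], "%Y-%m-%dT%H:%M:%S")
        paths = [p.text for p in entry.iter("path")]
        entries.append((int(entry.get("revision")), author, date, paths))
    return sorted(entries)


def churn_by_author(entries, window=timedelta(days=30)):
    """Per author: total commits, and commits whose files someone else touched within the window."""
    stats = defaultdict(lambda: {"commits": 0, "reworked": 0})
    for i, (_, author, date, paths) in enumerate(entries):
        stats[author]["commits"] += 1
        touched = set(paths)
        for _, later_author, later_date, later_paths in entries[i + 1:]:
            if later_date - date > window:
                break  # later commits fall outside the window
            if later_author != author and touched.intersection(later_paths):
                stats[author]["reworked"] += 1
                break  # count each commit at most once
    return dict(stats)
```

Line-level attribution (whose lines were actually replaced) would need the diffs themselves, for example via `svn blame` or `svn diff` per revision, which this sketch does not attempt.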
The usual algorithm is to use svn log to generate a set of issue/bug IDs which were "affected" by commits between those two build tags. Then link in the current status of those issues. That's two linear lists, not hard to do.
If you want to look at history per issue, then you need to produce a list-of-lists of the commits involved for each issue (scanning the log comments - a commit may involve multiple issues). You might then filter that to particular issue(s). You now have a list of commit IDs. svn can then give you a list of changed files for those, and you can then use svn to get the file histories (between the build tags) for each file. The only hard bit is the file content changes, and svn diff does the legwork for those. Andy Dingley (talk) 11:01, 7 February 2018 (UTC)
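The list-of-lists step could be sketched roughly as follows. This assumes the commits have already been pulled out of svn log as (revision, message, changed paths) tuples, and that issue IDs appear in the comment as "[1234]"; the function name and the tuple layout are made up for the example:

```python
import re
from collections import defaultdict

# Assumed tag format: one or more "[<digits>]" anywhere in the commit comment.
ISSUE_TAG = re.compile(r"\[(\d+)\]")


def group_by_issue(commits):
    """Map issue ID -> its revisions and the files those revisions touched.

    commits is an iterable of (revision, message, paths) tuples.
    """
    revisions = defaultdict(list)
    files = defaultdict(set)
    for revision, message, paths in commits:
        # A commit may mention several issues; count it once for each of them.
        for issue in {int(m) for m in ISSUE_TAG.findall(message)}:
            revisions[issue].append(revision)
            files[issue].update(paths)
    return {
        issue: {"revisions": revisions[issue], "files": sorted(files[issue])}
        for issue in revisions
    }
```

Filtering to particular issues is then a dictionary lookup, and the revision numbers can be handed to something like `svn diff -c REV` to see the content changes. Commits with no issue tag are silently dropped here, which may or may not be what you want.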