
Tuesday, May 19, 2009

Getting Started

After being sick all last week, I feel like I'm just getting started on this summer's research, and that I'm very behind! At the moment I'm reading papers on social networking tools for programmers, as it looks like I'll be collaborating with Sarah for the time being. We want to create a tool that uses the email and repository logs of software development projects to identify the "experts" within the development team who are most knowledgeable about different sections of code. Such a tool would be useful for identifying who you need to talk to before changing certain areas of code, and also for discovering people who perhaps should be in your social network but whom you are not currently in contact with.

I've done a lot of reading today to try to get a grasp of what tools are out there already, what problems exist, and what we could do differently.

[Who Should Fix This Bug?]
Brought up some things we might want to think about:
  • Is the person who checks the code into the repository the same person who made the changes to the code? (i.e. if it needs to get approved before being checked in, is the "approver" the person who checks it in, or the person who originally changed the code?)
  • How are we going to decide which features of the emails / repository logs / code constitute meaningful data that we should be using to construct our network? Is there someone on campus we could talk to about this? I don't feel comfortable "guessing" how to construct a meaningful network.
  • The paper mentions a tool called the "Expertise Browser" that sounds like it does exactly what we're trying to do -- "uses source code change data from a version control system to determine experts for given elements of a software project". I'm going to look into this!
[Expertise Browser: A Quantitative Approach to Identifying Expertise]
Expertise Browser is a tool that uses repository change data to determine who should be considered "experts" on specific sections of code, and also who the experts are for certain programming languages, overall releases of the software, and certain types of bugs.
  • The browser uses simple deletes/additions to the code as a unit to measure expertise. (Who did the change, how big was it, etc.) I originally thought that an approach this simplistic would be unable to give us the results we want, but the paper seems to have evidence to back this up as a valid method of determining expertise.
  • The paper notes that having individuals explicitly declare their areas of expertise would not be practical, as the declarations would require constant updating.
The paper also mentions some added functionality that we may want to include in our tool:
  • once a user has identified the expert they were looking for, we could allow them access to that person's contact information
  • give users the ability to "watch" desired areas of code by informing them when changes have been made to said areas
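To make the change-based measure above concrete, here is a minimal sketch of ranking experts per file from repository change data. It is only an illustration under my own assumptions: the change records, the names in them, and the scoring rule (lines added plus lines deleted) are hypothetical and are not the Expertise Browser's actual measure.

```python
from collections import defaultdict

# Hypothetical change records: (author, file path, lines added, lines deleted).
changes = [
    ("alice", "src/parser.py", 120, 30),
    ("bob", "src/parser.py", 10, 2),
    ("bob", "src/render.py", 80, 15),
    ("alice", "src/render.py", 5, 0),
    ("carol", "src/render.py", 40, 40),
]


def rank_experts(changes):
    """Return {path: [(author, score), ...]}, highest score first.

    The score is an assumed proxy for expertise: total lines an author
    has added or deleted in that file.
    """
    scores = defaultdict(lambda: defaultdict(int))
    for author, path, added, deleted in changes:
        scores[path][author] += added + deleted
    return {
        # Sort by descending score, breaking ties alphabetically by author.
        path: sorted(by_author.items(), key=lambda kv: (-kv[1], kv[0]))
        for path, by_author in scores.items()
    }


experts = rank_experts(changes)
```

Looking up `experts["src/render.py"]` would then give the candidate people to talk to before changing that file. A real tool would probably also want to weight recent changes more heavily, which this sketch ignores.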
[Tesseract]
An existing tool that compares the social network between programmers with the technical network between code components. It is based on research showing that work proceeds more efficiently when patterns of communication between workers match the logical dependencies of the code. (See the next paper I discuss, below.)
  • files are considered related if they are frequently changed together (degree of relatedness illustrated by thickness of line in the visualizer)
  • people are considered related if they have edited the same code, or have edited related sections of code
  • uses different coloured links to illustrate whether or not two related people also have a social relationship (i.e. a green line between people who have emailed each other, with line thickness representing amount of communication)
Again, I'm surprised by the simplicity of the techniques used to construct the networks. This is definitely starting to seem like a project we could do this summer. I guess the goal would be to make the interface a lot simpler than that of the existing tools, and possibly tailor it specifically for the people at the Hadley center (if we get access to their project data, I imagine we might discover certain nuances in their practices that we could use to construct more accurate networks of their data, specifically).
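As a rough sketch of how simple the Tesseract-style network rules described above could be, the following builds file links from co-changes, derives people links from them, and flags related people who have never emailed each other. The commit and email data are hypothetical, and the `min_weight` threshold is my own assumption rather than anything taken from Tesseract.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commits: (author, set of files changed together).
commits = [
    ("alice", {"a.py", "b.py"}),
    ("bob", {"b.py", "c.py"}),
    ("alice", {"a.py", "b.py"}),
    ("dave", {"c.py"}),
]
# Hypothetical pairs of people who have emailed each other.
emails = {frozenset({"alice", "bob"})}


def file_links(commits):
    """Count how often each pair of files was changed in the same commit."""
    links = Counter()
    for _, files in commits:
        for pair in combinations(sorted(files), 2):
            links[pair] += 1
    return links


def people_links(commits, links, min_weight=1):
    """People are related if they edited the same file or related files."""
    editors = {}
    for author, files in commits:
        for f in files:
            editors.setdefault(f, set()).add(author)
    related = set()
    # Rule 1: two people edited the same file.
    for people in editors.values():
        for p, q in combinations(sorted(people), 2):
            related.add(frozenset({p, q}))
    # Rule 2: two people edited files that are frequently changed together.
    for (f, g), weight in links.items():
        if weight >= min_weight:
            for p in editors[f]:
                for q in editors[g]:
                    if p != q:
                        related.add(frozenset({p, q}))
    return related


links = file_links(commits)
related = people_links(commits, links)
# Pairs with a technical link but no recorded communication.
missing = related - emails
```

In this toy data, dave is technically linked to both alice and bob but has emailed neither, so both pairs end up in `missing`. Those are the links a visualizer would draw in the "no social relationship" colour.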

[Design of Collaboration and Awareness Tools]

  • study that shows that projects have improved efficiency when the social network between programmers matches the technical network between code components
  • suggests that awareness tools that alert project managers when communication between people is lacking would be useful
  • suggests the display of a "dynamic buddy list" that will give a list of people most relevant to the code that you are working on
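One possible reading of the "dynamic buddy list" idea, sketched under my own assumptions: rank the other developers by how many changes they have made to the files you currently have open. The change log, the file names, and the `buddy_list` helper are all hypothetical, not something described in the paper.

```python
from collections import Counter

# Hypothetical change log: (author, file changed).
change_log = [
    ("alice", "model.py"), ("bob", "model.py"), ("bob", "model.py"),
    ("carol", "io.py"), ("bob", "io.py"), ("dave", "plot.py"),
]


def buddy_list(change_log, open_files, me, top_n=3):
    """Return up to top_n people most active in open_files, excluding me."""
    counts = Counter(
        author for author, path in change_log
        if path in open_files and author != me
    )
    # Most changes first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


buddies = buddy_list(change_log, {"model.py", "io.py"}, me="alice")
```

The list would be recomputed whenever the set of open files changes, which is what would make it "dynamic".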
