Anonymity of the Results

tristinautrey · June 12, 2019, 5:00pm

Good morning,

I have created several surveys and either gathered or generated data in order to practice with the network analysis tool. However, I’m curious how to anonymize the data once it is gathered. Or is there a way to do this on the front end before the survey is sent out? Another way to ask this is, “is there a way to toggle between anonymized and non-anonymized data?”

apitts · June 13, 2019, 4:50am

Generally, the way this is done is to first generate the networks from the survey in the normal way. Then, you can go to explore for each network and export to Excel. Once the data is in Excel you can use INDEX and MATCH to replace the name / id column with something like “Respondent 1”, “Respondent 2”, etc., and then also replace the Source and Target columns with the anonymized respondent name. You will also want to give some thought to removing or aggregating any attribute data that would otherwise make a respondent identifiable. Once done, save the Excel file and then either upload it as a new network or update the existing network.
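For anyone who would rather script the lookup than build it in Excel, here is a minimal Python sketch of the same idea. It is not a Polinode feature: it assumes the exported nodes and edges have already been read into lists of dicts, that the column names are "Name", "Source" and "Target", and the function name `anonymize_network` and the sample rows are made up for illustration.

```python
def anonymize_network(nodes, edges, key="Name"):
    """Replace node identifiers with 'Respondent N' and remap edges to match."""
    # Build the lookup once: original identifier -> anonymized label.
    mapping = {}
    for row in nodes:
        original = row[key]
        if original not in mapping:
            mapping[original] = f"Respondent {len(mapping) + 1}"

    anon_nodes = [{**row, key: mapping[row[key]]} for row in nodes]
    # Same role as the INDEX/MATCH lookup: a KeyError here means an edge
    # refers to a node that is missing from the nodes worksheet.
    anon_edges = [
        {**row, "Source": mapping[row["Source"]], "Target": mapping[row["Target"]]}
        for row in edges
    ]
    return anon_nodes, anon_edges, mapping


# Hypothetical sample data, not taken from a real export.
nodes = [{"Name": "Alice", "Team": "A"}, {"Name": "Bob", "Team": "B"}]
edges = [{"Source": "Alice", "Target": "Bob"}]
anon_nodes, anon_edges, mapping = anonymize_network(nodes, edges)
```

The returned `mapping` is the only link back to the real names, so it should be stored separately from anything that is shared (or discarded). Attribute columns such as "Team" are passed through unchanged and still need the manual review described above.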

A few users have indicated that they would appreciate the ability to generate anonymised networks by toggling an option at the point of generating the network in Polinode. A feature request along those lines would of course be welcome - feel free to post it under Feature Requests.

tristinautrey · June 13, 2019, 5:57pm

Thanks for the information. Quick question: I tried a couple of different ways to do this but couldn’t get it done with INDEX and MATCH. I did, however, simply replace the Name and Label columns with Respondent 1, etc. and it worked. Is there a reason why I shouldn’t do this?

apitts · June 14, 2019, 1:31am

Yes, this will work, i.e. it’s possible to just replace the Name and Label columns when using networks that are created from surveys because those networks use the id column for the source and target columns in the edges worksheet. Ideally though you would actually replace this id column in the nodes worksheet with something else and then do a lookup in the source and target columns to replace those ids with the new anonymized value. The reason is that the ids used are the ids that Polinode uses for respondents in the survey. If you are just screen sharing or similar this may not be an issue, but if you are actually providing access to the anonymized network then this isn’t best practice. The user with access is unlikely to have access to those Polinode ids, but it would be better to truly anonymize regardless.
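As a rough way to check whether a network is “truly” anonymized in this sense, a short Python sketch like the one below can scan the nodes and edges for any original id that is still present. The column names ("id", "Name", "Label", "Source", "Target"), the id values and the helper name `find_leaked_ids` are all assumptions for illustration, and the rows are assumed to have been loaded as lists of dicts with plain text or numeric cells.

```python
def find_leaked_ids(original_ids, nodes, edges):
    """Return the original ids that still appear anywhere in the given rows."""
    originals = set(original_ids)
    leaked = set()
    for row in list(nodes) + list(edges):
        for value in row.values():
            if value in originals:
                leaked.add(value)
    return sorted(leaked)


# Hypothetical case: only Name and Label were replaced, the id column was not.
nodes = [
    {"id": "abc123", "Name": "Respondent 1", "Label": "Respondent 1"},
    {"id": "def456", "Name": "Respondent 2", "Label": "Respondent 2"},
]
edges = [{"Source": "abc123", "Target": "def456"}]
leaked = find_leaked_ids(["abc123", "def456"], nodes, edges)
```

Here `leaked` lists both ids, because they remain in the id, Source and Target columns. An empty list would mean none of the original ids are left in the sheets that were checked.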

soconnor · August 5, 2019, 12:49pm

Maybe I’m not reading something right here, but as I understand it the quickest way to do this is just to replace the names in the attribute column under Name with numbers or “Respondent 1” etc. Is that right?
I’m not sure what is being referred to as the “label” column?

apitts · August 6, 2019, 1:42am

@soconnor, it sounds like you may be using the raw survey export rather than exported individual networks (i.e. post generation) here. You can still anonymise the data from the raw survey export, but in this situation you will need to anonymise the name column as well as the source and target columns in each relationship tab. The quickest way to do this is to add a new Name column to the nodes worksheet with the anonymised name and then use INDEX(MATCH()) in each relationship question tab to replace the original source and target with the anonymised equivalent. You would then paste as values and delete the original name, source and target columns before uploading the anonymised relationship questions as networks.
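The same steps for the raw survey export can be sketched in Python, applying one lookup across every relationship tab. This is only an illustration under stated assumptions: the worksheets are assumed to be loaded as lists of dicts, names are assumed to be unique in the nodes worksheet, the column names "Name", "Source" and "Target" may differ in a real export, and the tab names and the function `anonymise_raw_export` are hypothetical.

```python
def anonymise_raw_export(node_rows, relationship_tabs):
    """Anonymise the name column and every relationship tab with one lookup."""
    # Assumes each name appears only once in the nodes worksheet.
    lookup = {
        row["Name"]: f"Respondent {i}"
        for i, row in enumerate(node_rows, start=1)
    }
    # Overwriting the columns has the same end result as adding new columns,
    # pasting as values and deleting the originals.
    anon_nodes = [{**row, "Name": lookup[row["Name"]]} for row in node_rows]
    anon_tabs = {
        tab: [
            {**row, "Source": lookup[row["Source"]], "Target": lookup[row["Target"]]}
            for row in rows
        ]
        for tab, rows in relationship_tabs.items()
    }
    return anon_nodes, anon_tabs


# Hypothetical sample data with two relationship question tabs.
node_rows = [{"Name": "Alice"}, {"Name": "Bob"}, {"Name": "Carol"}]
tabs = {
    "Advice": [{"Source": "Alice", "Target": "Bob"}],
    "Trust": [{"Source": "Carol", "Target": "Alice"}],
}
anon_nodes, anon_tabs = anonymise_raw_export(node_rows, tabs)
```

Because every tab uses the same lookup, a given respondent gets the same anonymised name in all of the resulting networks.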