Do you have permission to (re)use that data?

You’ve used a key dataset – and/or graphic representations (tables, graphs etc) of a dataset – to inform your research and you want to publish it/them in your thesis.

The creator has made them publicly available…but just because something can be downloaded from a website doesn’t necessarily mean you can reproduce it in your thesis.

The appearance of data in different forms adds to the confusion, for example:

  • Students sometimes assume that GNS Science content is re-usable because GeoNet content is licensed under a CC 3.0 license
  • But material on the GNS Science website itself is copyrighted, while
  • The data is CC-BY licensed.

I found three really handy flow charts from the Australian National Data Service (ANDS) that help you navigate the licensing labyrinth, whether you are a data creator, supplier or user. They’re linked on the Massey University Library’s Ownership and Access to your Data information page (look under the Licensing sub-heading).

The questions you’ll read in the Data User’s Flowchart (for example) are colour coded:

  • blue for licensing queries
  • orange indicates that caution is required 
  • red indicates concern or the need for legal advice, and
  • green indicates you can complete the process.

The most important message from this flowchart (and all of them, really) is: if in doubt, ask the data owner if you can publish the data, or seek a legal opinion.

Posted in Research Data Management

‘Data breaches – the new oil spills’: protecting research participants and their sensitive personal data

“If data is the new oil, then data breaches are the new oil spills.” This observation by blogger Frith Tweedie comes hot on the heels of news reports that the Ministry for Culture and Heritage mistakenly exposed the personal details of hundreds of young people online. The sensitive data, belonging mostly to teenagers, had been uploaded to an external website without adequate protection measures. And last week, I read another news report that the results of online tests for depression taken by members of the public through the Depression.org website have been exposed to third-party companies.

Digital and online technologies (e.g. smartphones, online survey tools, social media, collaboration platforms etc.) are mainstream tools that are increasingly used to collect, store and share research data. For all their usefulness in the research lifecycle, they each have their limitations. Arguably many of the ‘digital data spills’ stories appearing in the news are down to data collectors not fully understanding the limitations of the technologies they use, or not taking adequate steps to mitigate risks to the data and research participants.

If you’re a researcher who needs to store sensitive data and move it between systems, what can you do?

Firstly, make sure you don’t access or disclose sensitive personal information without prior agreement with your research participants.

Beyond this, put a security management plan in place that safeguards data in digital form when it’s ‘at rest’ (e.g. in storage) and ‘in transit’ (e.g. being moved between systems and devices).

A couple of key steps in security management planning include:

  1. Carry out a risk assessment to classify the sensitivity levels of the data you will collect. The Privacy Commissioner has a Privacy Impact Assessment Toolkit to help with this.
  2. Evaluate the privacy limitations of the digital and online technologies you’ll use to collect, store and manage sensitive data. Have a look at the ‘terms of service’, which should detail how data will be used by the vendor and/or shared with third parties.
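
One small safeguard for data in transit can be sketched in code. The Python example below is a minimal sketch, not a full security plan: it compares SHA-256 checksums of a file before and after it is moved, so you can tell whether the copy arrived unchanged. The function names (sha256_of_file, transfer_is_intact) and the file paths in the usage note are hypothetical, chosen only for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large data files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, copied_path):
    """Return True if the copied file has the same digest as the source."""
    return sha256_of_file(source_path) == sha256_of_file(copied_path)
```

You might call transfer_is_intact("survey_results.csv", "/mnt/share/survey_results.csv") after copying a file to shared storage (both paths are made up). A matching checksum only shows that the file was not altered or corrupted along the way. It does not keep the contents private, so encryption and access controls are still needed for sensitive data.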

Posted in Research Data Management

New NZ Data and Statistics Legislation is in the Pipeline

A number of things happened in 1975.  Microsoft was founded (or Micro-soft as it was apparently then known), the mood ring was invented, the movie Jaws was released, Robert Muldoon became New Zealand’s Prime Minister and the Statistics Act 1975 came into law.

While this piece of legislation has been modified over the years, it is essentially legislation for the last century, when statistics were produced in hardcopy and issues of privacy were very different from today. In fact, the current legislation doesn’t even refer to data. It is no longer fit for purpose, and is being reviewed.

The aims of the updated data and statistics legislation are:

  • Ahead of the game: modern and future focused
  • Treaty partnership reflected
  • Data recognised as a strategic and precious asset
  • Safeguards and protections in place.

As the consultation process moves forward, it will be interesting to see how the new data and statistics legislation applies to publicly funded data collected and used by the tertiary education sector – both for research and core business purposes. The tertiary and research sectors will have input into the new Act’s development. You can find out more about the work here.

The current legislation dates from a bygone era.
CC BY 2.0

Posted in Legislation

Free data visualisation training

Being able to tell the story that lies in data in an engaging and visual way is an important skill that helps a much wider (and non-expert) audience understand the insights made in research. Open Data NZ are planning to run some free data visualisation workshops in Wellington (June 8-9) and in Dunedin (June 18-19).

Click on a link in your preferred location to register. These free workshops have been held in the past and are hugely popular, so get in quick.

Posted in eResearch tools, Events

Data Classification Framework – a traffic light system for understanding data at Massey

Protecting data has been at the forefront of my conversations recently. Why? The Data Management Policy sits in my care. It promises that, in order to manage our data as an asset, sources of data need to be identified, classified and assigned a Data Custodian. Wow, I thought! It makes sense – in order to look after data, we need to know what we have, who is responsible for it, and how they are protecting it. But this ocean is quite big, I thought. How do I start the conversation? Answer – you just start!

Together with representatives who have a keen interest in ensuring our data is protected, we started a conversation. We talked about the need to classify data: what the benefits might be, and what the implications might look like for both you and the University. Together we are crafting our first draft of a Data Classification Framework – classifying our data based on its level of sensitivity.

The framework is still in its infancy.  It has its own journey to take – but, we have started.  Shortly we will be reaching out to a wider University community to take this conversation broader, to ensure the views of the enterprise are considered.

Conversation 1 – done, tick.

Conversations 2, 3, 4 – to be continued…

June Wirihana, Data Analyst

Posted in data classification, Research Data Management

Data delivery solutions designed for Massey Genome Service

A technician from the Massey University Genome Service got in touch with ITS recently asking for help with a solution for delivering results from its sequencing service to its customers. Based on the requirements provided by the technician, the ITS Research Technology Team applied for a 1TB CloudStor group drive from AARNet (a high-speed network and collaboration services provider).

Two months later, a trial 1TB CloudStor was created. The Genome Service technician gave feedback that this service worked very well for what was required. The service is currently free for this project; however, it is likely to become a paid service at some point.

As an alternative, ITS Research Technology has encouraged the Massey Genome Service to use Microsoft SharePoint. It is a different user experience to CloudStor, but delivers the same information. There is no charge to use SharePoint in this way, which means no additional charges will be passed on to the Service’s clients.

Got a question for the ITS Research Technology Team about any eResearch tools you need, or about this case study? Log a job with AskUS Self-Service – remember to write ‘eResearch’ in the subject line.

Posted in eResearch tools, eResearch Use cases

Free!! Software Carpentry Workshop @ Massey University Auckland (14-15 February)

Register now to secure your place at a free introductory Software Carpentry workshop at the Albany Campus. Check out the Massey Events website to find out what’s on offer. No prior knowledge is required.

Posted in eResearch tools, Events

Have your say: High Performance Computing at Massey

Do you use High Performance Computing (HPC) for your research? This is an opportunity for you to inform a roadmap for the delivery of HPC for Massey University.

Information-gathering workshops are being held at each campus during February. HPC generally refers to the practice of aggregating computing power in a way that delivers higher performance, as opposed to using a single resource, to solve large-scale problems and efficiently process demanding workloads in science, engineering, media production, and business.

Help us to understand your College’s HPC requirements. These workshops are open to Massey students and staff so please forward this invite to interested people.

Background

During 2016/17 a collaborative University-wide Research Data Management Working Group was formed to develop an institutional direction on how research data and information will be managed and retained as institutional assets. The group was tasked with engaging with academics to appreciate the challenges they face; this information would be used to prioritise improvement initiatives, direct investment and discover areas of risk. During this time a survey was sent to all College staff, followed by interviews with staff who agreed to be interviewed. One of the areas of concern identified through the survey and interviews was HPC. Key HPC issues identified were:

  • issues with accessing compute resources in a timely manner
  • insufficient compute capacity with some Colleges having inadequate access to infrastructure
  • data on the Massey-owned HPC cluster is not backed up
  • the Massey-owned HPC does not have an ongoing maintenance budget.

To respond to this, an HPC roadmap is being developed that will consider the existing HPC capabilities within Massey University and make recommendations for providing a high-quality, fit-for-purpose HPC ecosystem that supports research across the University, including internal and external options such as Simurg, NeSI, other research institutes and cloud offerings. Your input is important for informing the recommendations.

The workshop will be 1.5 hours long and will focus on:

  • existing HPC infrastructure being used
  • HPC requirements (includes a discussion on the workload patterns, frequency, deadlines, support etc.)
  • HPC issues and concerns with current options (or lack thereof)
  • future options for HPC.

Workshop dates and locations:

  • Wellington, Thursday 1st February, 1pm, Scrum in Te Whare Pukaka, Block 1
  • Palmerston North, Thursday 8th February, 1:30pm, GLB2.03
  • Wellington, Friday 16th February, 10:30am, Scrum in Te Whare Pukaka, Block 1
  • Albany, Tuesday 20th February, 1:30pm, OR6
  • Albany, Wednesday 21st February, 9:30am, AT5
  • Palmerston North, Friday 23rd February, 9:30am, GLB2.03

If you have any questions or are unable to attend a workshop and want to contribute don’t hesitate to get in touch with Sarah Ellison or Patrick Rynhart, p.rynhart@massey.ac.nz, to discuss further.

RSVP to Sarah Ellison, s.ellison@massey.ac.nz, at least two days before the workshop, stating which workshop you are attending, to ensure enough biscuits are available and there is sufficient capacity in the rooms.

Posted in eResearch tools

Jurisdictional risks of cloud storage

Now that information has been gathered from researchers about what they need to manage their research data better, work has begun to address the gaps raised.

We’re going to actively work on intellectual property and jurisdictional risks to data stored in the cloud in the last quarter of 2017. Jurisdictional risk refers to the concept that information which has been converted and stored in binary form (in the cloud) is subject to the laws of the country in which it is located, and sometimes that carries risks. The Government has developed resources on assessing the risks of cloud services if you want to read more about it.

In other news, June Wirihana has joined ITS as Data Management Specialist.

Posted in data sovereignty, eResearch tools, Jurisdictional risk

Research Data Management Initiative

The University is working to improve support and services for Massey University researchers in regard to the management of research data.

A working group with cross-organisational membership (Research and Enterprise, ITS, the Library and academic representatives) was established in 2015 to investigate the problems being experienced, and to work out a plan for improving the situation.

The investigations

The Office of the Assistant Vice-Chancellor Research, Academic and Enterprise issued a survey to Research Staff, and a series of interviews with people who agreed to be contacted was carried out. The investigations aimed to uncover:

  • Current practices in relation to the creation, use, storage and preservation of research and teaching data including collaboration and sharing.
  • The aspects of research data management requiring additional support and services.

What was found?

Some clear themes emerged from the investigations:

Data storage

  • Current storage provisions are insufficient
  • processes to get additional storage are time consuming and challenging
  • the cost charged for additional storage is difficult to fund

    “Portable storage is a survival technique”

Collaboration

  • Difficulties collaborating with external parties
  • large data volumes need to be shared
  • Dropbox is the most popular cloud-based tool, but it is not an approved platform and is inconsistently funded
  • issues with distributing large files to students

    “I have a major collaboration project coming up later this year. I have a…process in place but I don’t know if it will work in a year’s time”

High performance computing

  • Issues with accessing compute resources in a timely manner
  • Insufficient compute capacity
  • data on the Massey-owned HPC cluster is not backed up
  • the Massey-owned HPC does not have an ongoing maintenance budget

    “I have had to buy my own laptop with the processing capacity I need”

Support and advice

  • More support and advice required for managing data, both structured and unstructured; including when to make use of personal databases for storing and managing data
  • advice required in several different formats, such as web page, one-on-one, fact sheet
  • unclear on intellectual property and rights management

    “Everyone has reinvented the wheel for themselves”

Data protection and rights

  • Risks associated with data stored on cloud-based collaboration tools, such as data sovereignty, data loss if a staff member leaves Massey University, accessibility and security
  • risks associated with data stored on external hard drives, such as data not being backed up, data often being taken when staff leave Massey University, and security
  • difficulties with intellectual property and managing rights to research data, particularly when collaborating with external parties

    “There is little advice provided by Massey on ownership of data and how to protect it. I’m nervous about sharing in collaborative environments…”

Next steps

A programme of work to address the gaps raised by researchers is being developed. Some initiatives are ‘quick wins’ and these are currently being worked through. Others involve larger pieces of work that will be incorporated into a ‘roadmap’. The table below lists the initiatives (please note that additional issues raised that are not listed in this table will also be addressed).

Type | Item | How | Who
Quick win | Database management advice | Paper written presenting options with consultancy available | ITS
Quick win | Data protection and ownership advice | Content written and added to the library research data management pages | Library
Quick win | Email storage increase | Project started to move email to Office 365 which increases mailboxes to 50GB | ITS
Quick win | Storage options decision tree | Once the options and recommendation is complete create a decision tree and make it available on research data management web pages plus communicate with all staff | ITS and Library
Quick win | GitHub advice | Provide guidance for creating and managing a GitHub repository (Colleges would be responsible for funding their GitHub repositories) | ITS
Quick win | Better communication and visibility for services | For example how many academics are aware of ZendTo as an option | ITS
Roadmap | Cloud storage for collaboration | Options analysis (including OneDrive, Dropbox or CloudStor) with a recommendation | ITS
Roadmap | Storage roadmap | Review current options and investment implementations | ITS
Roadmap | Data classification and backup options | Review the approach for backing up data (both within ITS and college based infrastructure) to see if we can leverage a data classification to reduce the costs or extend to other data without increasing the costs | ITS
Roadmap | Process for making purchases through ITS | Review whether recent changes are sufficient for providing an easy and time effective process for academic staff to make hardware purchases | ITS
Roadmap | eResearch offering | This will support short term, on demand, scalable HPC as required and provide consultancy support for ITS research support | ITS
Roadmap | High performance computing decision tree | Once the options, recommendation and approach is agreed then a decision tree is made available on research data management web pages on the options and how to access these options and communicated to relevant staff, for example CoCA, INMS | ITS, TEAB and Library

Further information

Contact RDM Working Group Convenor Sarah Ellison S.Ellison@massey.ac.nz, or Natalie Dewson n.m.dewson@massey.ac.nz if you need further information.

Posted in Research Data Management