WEBVTT - generated by VCS

1
00:00:00.144 --> 00:00:03.664
How can you start writing your data management
chapter?

2
00:00:04.164 --> 00:00:07.564
We have some suggestions. You don't have to
follow them.

3
00:00:07.824 --> 00:00:10.204
It's just a way of

4
00:00:10.444 --> 00:00:15.124
collecting all the information and what to
keep in mind while discussing this

5
00:00:15.125 --> 00:00:20.104
with your colleagues. We suggest that you

6
00:00:20.584 --> 00:00:23.904
use a holistic approach to research data management,

7
00:00:23.905 --> 00:00:27.804
that reflects all the issues

8
00:00:28.404 --> 00:00:31.754
I just introduced. So don't

9
00:00:32.114 --> 00:00:34.974
only think about the IT landscape.

10
00:00:35.494 --> 00:00:39.294
Without the conceptual

11
00:00:39.554 --> 00:00:41.654
measures, the IT landscape

12
00:00:41.655 --> 00:00:43.934
might not be useful.

13
00:00:44.294 --> 00:00:48.114
First, think about what you really want to
do in your project,

14
00:00:48.115 --> 00:00:50.734
what is needed for your project,

15
00:00:50.754 --> 00:00:53.054
like how closely you work together,

16
00:00:53.055 --> 00:00:55.254
what exactly your research plan is.

17
00:00:55.474 --> 00:00:58.474
What do you really need for research data handling?

18
00:00:59.174 --> 00:01:03.000
And, after that, choose the suitable IT landscape
for

19
00:01:03.000 --> 00:01:08.000
that. Don't forget to address that somebody

20
00:01:08.000 --> 00:01:12.000
has to do the work when it comes to research
data management and that

21
00:01:12.000 --> 00:01:17.000
you need resources for that and you have to
allocate responsibilities

22
00:01:17.000 --> 00:01:22.000
for that. Also keep in mind that you might
have

23
00:01:22.000 --> 00:01:26.000
to shape the competences and skills for research
data management and

24
00:01:26.000 --> 00:01:31.000
also generate awareness for research data management

25
00:01:31.000 --> 00:01:36.000
and also for some legal implications that come
with

26
00:01:36.000 --> 00:01:41.000
data. You might

27
00:01:41.000 --> 00:01:46.000
already have a lot of experience with research
data management because

28
00:01:46.000 --> 00:01:48.000
you have a lot of experience with research.

29
00:01:49.000 --> 00:01:53.000
So just start a conversation with your collaborators

30
00:01:53.000 --> 00:01:58.000
about the strengths you already have as a consortium.

31
00:01:59.000 --> 00:02:03.000
What is already going well

32
00:02:03.000 --> 00:02:08.000
while working together? What already went very

33
00:02:08.000 --> 00:02:12.000
well in previous projects when it comes to
data handling? And then

34
00:02:12.000 --> 00:02:17.000
just use this and keep those measures and strategies
and describe

35
00:02:17.000 --> 00:02:21.000
those strategies in this chapter about research
data management.

36
00:02:22.000 --> 00:02:26.000
Also, you might have experience of what went
wrong when it comes to data exchange,

37
00:02:27.000 --> 00:02:32.000
for instance, when it comes to making data
understandable,

38
00:02:32.000 --> 00:02:34.000
for later,

39
00:02:34.000 --> 00:02:36.000
documented data, maybe. So,

40
00:02:37.000 --> 00:02:42.000
be frank about what went wrong before and,

41
00:02:42.000 --> 00:02:45.000
maybe, introduce some changes,

42
00:02:45.000 --> 00:02:49.000
some measures that fix those problems you ran
into,

43
00:02:49.000 --> 00:02:53.000
before. And think about what risks

44
00:02:53.000 --> 00:02:58.000
there might be, that is, which requirements might
not

45
00:02:58.000 --> 00:03:02.000
be met, because you are not introducing

46
00:03:02.000 --> 00:03:07.000
specific strategies and think about how you
can reduce those risks.

47
00:03:07.000 --> 00:03:12.000
Mostly, it's because no responsibilities have
been allocated

48
00:03:12.000 --> 00:03:16.000
or no resources for research data management
have been planned.

49
00:03:16.000 --> 00:03:20.000
So calculate what could go wrong in the future
and how you can

50
00:03:20.000 --> 00:03:25.000
mitigate those risks, mostly by planning for
some resources for

51
00:03:25.000 --> 00:03:29.000
research data management as well. And also
think about

52
00:03:29.000 --> 00:03:31.000
what changes

53
00:03:31.000 --> 00:03:35.000
you can implement and how these changes could
be

54
00:03:36.000 --> 00:03:41.000
a sustainable solution for all the partners
in the future.

55
00:03:41.000 --> 00:03:46.000
So maybe you can introduce some research data
management procedures that

56
00:03:46.000 --> 00:03:48.000
might stay for the chair, for the faculty,

57
00:03:49.000 --> 00:03:51.000
and so on.

58
00:03:51.000 --> 00:03:56.000
This could be a starting point for your conversation
with your collaborators.

59
00:03:56.000 --> 00:04:00.000
This is a so-called SWOT analysis about strengths,

60
00:04:00.000 --> 00:04:03.000
weaknesses, risks

61
00:04:03.000 --> 00:04:05.000
or threats, and opportunities.

62
00:04:10.000 --> 00:04:15.000
To collect all information that might go into
a data management plan,

63
00:04:16.000 --> 00:04:20.000
maybe the data management plan for the
European Commission or the data

64
00:04:20.000 --> 00:04:24.000
management plan that you want to use from
the start

65
00:04:24.000 --> 00:04:29.000
of your project, there is

66
00:04:29.000 --> 00:04:33.000
a template that's called Research Output Management
Planning - ROMPi.

67
00:04:33.000 --> 00:04:37.000
It's published via Zenodo,

68
00:04:37.000 --> 00:04:40.000
if you want to have a look at this.

69
00:04:40.000 --> 00:04:42.000
I used this to

70
00:04:43.000 --> 00:04:48.000
build a help template for

71
00:04:48.000 --> 00:04:53.000
research data management planning for European
Commission proposals.

72
00:04:54.000 --> 00:04:57.000
Basically,

73
00:04:58.000 --> 00:05:03.000
this template uses the

74
00:05:03.000 --> 00:05:07.000
project planning, project management with work
management with work

75
00:05:07.000 --> 00:05:12.000
packages, and analyzes which research output
follows

76
00:05:12.000 --> 00:05:16.000
from which work package. The research output
could be

77
00:05:17.000 --> 00:05:20.000
data, or processed data,

78
00:05:20.000 --> 00:05:25.000
or a method, or a protocol, or a literature
summary,

79
00:05:25.000 --> 00:05:29.000
or something like that. So, first have a look
at each work package,

80
00:05:30.000 --> 00:05:33.000
which research output might arise from those
work packages,

81
00:05:33.000 --> 00:05:38.000
and then try to come up with how you want to make
this research output

82
00:05:38.000 --> 00:05:40.000
FAIR, again for you,

83
00:05:41.000 --> 00:05:45.000
for your future self, for the whole project
group,

84
00:05:45.000 --> 00:05:49.000
or for the community. And if you are doing
that

85
00:05:49.000 --> 00:05:52.000
for each relevant

86
00:05:53.000 --> 00:05:58.000
research output, in each work package where
a relevant research

87
00:05:58.000 --> 00:06:00.000
output is produced, you

88
00:06:00.000 --> 00:06:05.000
will have an overview of what measures for
research data management

89
00:06:05.000 --> 00:06:08.000
you have to implement for your project.

90
00:06:08.000 --> 00:06:13.000
This is very detailed. I admit you might
not need this for

91
00:06:13.000 --> 00:06:17.000
the proposal stage of your

92
00:06:17.000 --> 00:06:22.000
data management planning, but I think you will
need this

93
00:06:22.000 --> 00:06:25.000
during project planning later on,

94
00:06:25.000 --> 00:06:28.000
at least for the data management plan

95
00:06:29.000 --> 00:06:34.000
that the European Commission is requesting
in the first six

96
00:06:34.000 --> 00:06:39.000
months of your project. You can see here that

97
00:06:39.000 --> 00:06:44.000
there's a table, a project management table,
that collects

98
00:06:44.000 --> 00:06:48.000
all the information you have to produce for
the data description.

99
00:06:49.000 --> 00:06:51.000
Will you reuse data?

100
00:06:51.000 --> 00:06:56.000
Do you produce sensitive data? Which data
types will you produce

101
00:06:56.000 --> 00:07:01.000
in the different work packages? What data format
will the data

102
00:07:01.000 --> 00:07:05.000
have for the research output in the different
work packages? That might

103
00:07:05.000 --> 00:07:10.000
be the same for all of the work packages or
might be different for the

104
00:07:10.000 --> 00:07:15.000
work packages. Again, if you are using some
software that produces specific

105
00:07:15.000 --> 00:07:17.000
data formats, name the software.

106
00:07:18.000 --> 00:07:22.000
Then estimate which data volume

107
00:07:22.000 --> 00:07:25.000
you will produce in the different work packages,

108
00:07:25.000 --> 00:07:30.000
and for which target group this data or this
research output

109
00:07:30.000 --> 00:07:35.000
may be relevant. You can just put it into the
last column

110
00:07:35.000 --> 00:07:40.000
here. The next table

111
00:07:40.000 --> 00:07:45.000
summarizes how you make the data

112
00:07:45.000 --> 00:07:48.000
Findable for each of the work packages.

113
00:07:49.000 --> 00:07:51.000
Here I listed

114
00:07:51.000 --> 00:07:54.000
some examples of

115
00:07:54.000 --> 00:07:57.000
what you have to put in the different columns

116
00:07:57.000 --> 00:08:00.000
to make it findable. This could be done with
metadata.

117
00:08:01.000 --> 00:08:06.000
So if you're putting some specific metadata
alongside the data,

118
00:08:06.000 --> 00:08:09.000
it might be easier to find specific data.

119
00:08:09.000 --> 00:08:13.000
Maybe you already have some metadata standards
you're using.

120
00:08:14.000 --> 00:08:16.000
This can also make the data more findable,

121
00:08:16.000 --> 00:08:19.000
for you, for the project group,

122
00:08:19.000 --> 00:08:25.000
also for the community later on. A specific
data organization,

123
00:08:25.000 --> 00:08:28.000
which means

124
00:08:28.000 --> 00:08:31.000
developing a file naming convention

125
00:08:31.000 --> 00:08:33.000
or a folder structure

126
00:08:34.000 --> 00:08:37.000
for your data, can also make the data findable.

127
00:08:37.000 --> 00:08:40.000
So if you have a specific naming convention

128
00:08:41.000 --> 00:08:44.000
that you agree upon with all your collaborators,

129
00:08:44.000 --> 00:08:49.000
then of course the data will be easier to find

130
00:08:49.000 --> 00:08:53.000
on your central data storage for all of your
collaborators.

131
00:08:53.000 --> 00:08:58.000
Later on, you have to put a persistent identifier
on

132
00:08:58.000 --> 00:09:00.000
the data when you publish it.

133
00:09:00.000 --> 00:09:03.000
Most of the repositories do this automatically.

134
00:09:04.000 --> 00:09:08.000
For instance, there is also the possibility
of getting a DOI,

135
00:09:08.000 --> 00:09:13.000
a digital object identifier, if you're publishing
data on a specific

136
00:09:13.000 --> 00:09:17.000
repository. And this might be the same for
all work packages or might be

137
00:09:17.000 --> 00:09:20.000
different for the work packages.

138
00:09:21.000 --> 00:09:24.000
How you make data accessible is something

139
00:09:24.000 --> 00:09:27.000
you can describe with this table if you want.

140
00:09:28.000 --> 00:09:32.000
So who is responsible for giving the
people

141
00:09:32.000 --> 00:09:35.000
in your project access to the data?

142
00:09:35.000 --> 00:09:38.000
Who is responsible for giving access,

143
00:09:38.000 --> 00:09:43.000
for instance, to a network drive for the people
who need the access or

144
00:09:43.000 --> 00:09:46.000
for giving access to a cloud storage?

145
00:09:47.000 --> 00:09:52.000
Will the data be open access, maybe later in
the repository

146
00:09:52.000 --> 00:09:57.000
you just named before? Or you name the repository
again

147
00:09:57.000 --> 00:10:01.000
here. The European Commission wants you to
choose a trusted

148
00:10:01.000 --> 00:10:06.000
repository and there are websites where they
describe what they mean by a trusted

149
00:10:06.000 --> 00:10:10.000
repository. I just put some examples here.

150
00:10:10.000 --> 00:10:15.000
Will there be a standard access protocol? Most
of the repositories have this but

151
00:10:17.000 --> 00:10:20.000
your network drive may not - that's no problem.

152
00:10:20.000 --> 00:10:25.000
Just describe how access will be managed
within the project and

153
00:10:25.000 --> 00:10:30.000
later on. For making

154
00:10:30.000 --> 00:10:34.000
the data interoperable, the same points might

155
00:10:34.000 --> 00:10:37.000
apply as the ones

156
00:10:37.000 --> 00:10:40.000
you already described in the findable part,

157
00:10:40.000 --> 00:10:43.000
but maybe for making data interoperable

158
00:10:43.000 --> 00:10:47.000
other metadata will be used. Just describe
this here.

159
00:10:48.000 --> 00:10:50.000
Again, if you're using some standards,

160
00:10:50.000 --> 00:10:55.000
just name them here. Maybe you're using the
DataCite standard,

161
00:10:55.000 --> 00:11:00.000
maybe you're using a discipline-specific standard
for engineering or

162
00:11:00.000 --> 00:11:03.000
for the social sciences. There are standards
you can

163
00:11:03.000 --> 00:11:05.000
have a look at.

164
00:11:05.000 --> 00:11:09.000
We can also guide you to some websites where
you will find standards.

165
00:11:10.000 --> 00:11:13.000
Are you using a standard vocabulary?

166
00:11:14.000 --> 00:11:16.000
Or have you developed a vocabulary,

167
00:11:17.000 --> 00:11:19.000
a glossary, or something like that

168
00:11:19.000 --> 00:11:21.000
within your project? Then just name it here.

169
00:11:22.000 --> 00:11:26.000
Are you using an ontology that has already
been published by

170
00:11:26.000 --> 00:11:31.000
a certain institution? Or maybe you are using
a vocabulary

171
00:11:31.000 --> 00:11:34.000
that comes with your measurement device, then
just name it here.

172
00:11:34.000 --> 00:11:38.000
If you know this at the proposal stage,

173
00:11:38.000 --> 00:11:40.000
you can name this in your proposal.

174
00:11:41.000 --> 00:11:46.000
If not, then you don't have to describe
this in such detail yet,

175
00:11:46.000 --> 00:11:49.000
but you can say, for instance, "We will

176
00:11:49.000 --> 00:11:53.000
look for standard vocabularies or for
standards.

177
00:11:53.000 --> 00:11:58.000
And if there are none, we will develop our own standards
and describe those standards."

178
00:12:00.000 --> 00:12:02.000
For making the data reusable,

179
00:12:03.000 --> 00:12:06.000
good documentation helps:

180
00:12:07.000 --> 00:12:12.000
Will you put readme files or protocols or codebooks,
scripts and

181
00:12:12.000 --> 00:12:16.000
so on alongside your data? Just name it here
or describe it here.

182
00:12:17.000 --> 00:12:21.000
What data quality procedures will you develop?

183
00:12:22.000 --> 00:12:26.000
Maybe you have meetings from time to time on
data quality,

184
00:12:26.000 --> 00:12:30.000
or review the data with some colleagues?

185
00:12:30.000 --> 00:12:35.000
Maybe you have some checklists that help each
member of the project with

186
00:12:35.000 --> 00:12:37.000
data quality? Again, here,

187
00:12:38.000 --> 00:12:42.000
put in which repository you are planning to
use later on so

188
00:12:42.000 --> 00:12:46.000
that people can download the data there for
reuse,

189
00:12:47.000 --> 00:12:51.000
people from your community. And here the license
is important.

190
00:12:52.000 --> 00:12:57.000
You have to put alongside your data a license
that describes

191
00:12:57.000 --> 00:13:01.000
who is allowed to use the data and for

192
00:13:01.000 --> 00:13:06.000
which purposes. The European Commission wants
you to put a CC -

193
00:13:06.000 --> 00:13:11.000
Creative Commons - CC0 license or CC BY license
alongside

194
00:13:11.000 --> 00:13:16.000
your data. Most of the repositories offer this

195
00:13:16.000 --> 00:13:20.000
possibility to put a certain license alongside
the data.

196
00:13:20.000 --> 00:13:23.000
You can just click it there

197
00:13:23.000 --> 00:13:25.000
and then,

198
00:13:26.000 --> 00:13:30.000
the license is automatically attached to your
data.

199
00:13:33.000 --> 00:13:38.000
Storage: here you would just describe for each
work package,

200
00:13:38.000 --> 00:13:41.000
where you will put the data during the project.

201
00:13:41.000 --> 00:13:46.000
Is there a backup of this secure storage, an
automatic backup,

202
00:13:46.000 --> 00:13:50.000
or do you have to do the backup manually?

203
00:13:50.000 --> 00:13:53.000
How will you transfer data? Maybe this is already

204
00:13:54.000 --> 00:13:56.000
described in all the other

205
00:13:57.000 --> 00:14:02.000
tables, maybe not, because you have to transfer
data

206
00:14:02.000 --> 00:14:05.000
in a different way, since you only have interfaces
between

207
00:14:05.000 --> 00:14:09.000
two partners or between two or three subprojects.

208
00:14:10.000 --> 00:14:14.000
And here you should describe how you are planning
for long-term preservation of

209
00:14:14.000 --> 00:14:17.000
the data. If you follow good scientific
practice,

210
00:14:17.000 --> 00:14:21.000
you are requested to archive the data for at
least 10 years.

211
00:14:21.000 --> 00:14:25.000
And here you can describe how you will do that.

212
00:14:25.000 --> 00:14:30.000
This might be the data management plan: if

213
00:14:30.000 --> 00:14:33.000
you fill in all those tables

214
00:14:34.000 --> 00:14:39.000
as you go within your project,

215
00:14:39.000 --> 00:14:43.000
this might be the data management plan for
your

216
00:14:43.000 --> 00:14:45.000
group. And

217
00:14:45.000 --> 00:14:50.000
the data management plan you hand in to the
funding body might be

218
00:14:50.000 --> 00:14:55.000
a summary that you can create out of filling
in those tables,

219
00:14:55.000 --> 00:15:00.000
but it would also be okay to hand in those
tables as a

220
00:15:00.000 --> 00:15:02.000
data management plan for the European Commission,

221
00:15:02.000 --> 00:15:07.000
if everything is described in an understandable

222
00:15:07.000 --> 00:15:10.000
way for the European Commission in here.

