Hi,
Thanks for the review and suggestions, @wolfgang8741. I made various changes and tried to answer your questions and explain some points in more detail. Let me know if anything still needs to be clarified.
In general, one of the main reasons for collecting and studying this data is that we can study not just the mobility of large masses of people (as has been done in the past by big telecom companies etc.), but also patterns of particular kinds of mobility, e.g. how researchers or educators travel around the world.
Why these categories? First, because in general, people in these professions are responsible for knowledge transfer (conference participation, workshops for doctors etc.), and it would be very important to learn how they travel and what we could do together to use this knowledge, which belongs to society.
The added possible benefit of preprocessing timelines would be offering turnkey data for analysis at new resolutions. It would also allow limiting the data to the needs of projects, i.e. what timeframe is necessary for sharing (last week, last month, a user’s entire digital life?).
Indeed, I can clarify the timeframe: usually it starts with a stay of one week and has no upper limit. But in general, information about any stay in a place is relevant.
Another important point is how to collect information about places where people stayed and had free time (which could later be used for social engagement or citizen science projects, which is essentially one of the goals of the project).
Since this project is only interested in scientists, what metadata would be necessary to determine the start date of an analysis? For example, even though the data doesn’t go far enough back for everyone, some scientists are young enough that the data may go back to before they were scientists. I don’t see in the survey or other collection how that could be determined in an analysis, but that is just something to think about:…
The problem here is that there is no real basis for this (except Easychair data, which is not open as far as I know; maybe @madprime @gedankenstuecke know of some other database like this). I could indeed suggest submitting Easychair data additionally, but this may be a bit difficult since not every conference uses Easychair.
How does one judge the “potential of the untapped” in this project?
There are many ways. For example, there is a project that maps places around the world, in which people try to identify places where volunteering is needed (e.g. https://www.hotosm.org/), and this may be useful for identifying untapped opportunities for visits.
@wolfgang8741 I also adjusted the question in the questionnaire:
“What are the countries are you working with? (but maybe have not visited yet)”
Where is the ORCID collected? (pardon if I missed this)
We collect the Openhumans ID, and if a person participates in the questionnaire and provides information about his/her research institute, we deduce that the person is a scientist. This is a good point; I have now also included ORCID.
I also changed the Google Form for the questionnaire to ask more precisely about the countries (not continents) you visited. Thank you!
analysis of our travels and meta-information of who is traveling can provide more information about how we can use it for social good
Yes, this indeed had to change, since this term has been misused recently and can therefore be understood in too many different ways. I changed it, thank you for the comment!
@liubovv Sorry for the delayed reply; your response was buried in my inbox. The changes are moving this forward, and it’s looking much clearer. Most of this is comments on the survey revisions to ensure you’ll get the data you need and to clarify previous points.
Yes, and what I was also trying to get at here is this: given that you’re collecting the entire Google Location history, how do you know when a person is classified as a “scientist”? Is it a date after they earned a degree, or the date they entered a job or role as a “scientist”? This was a prompt to see if you have a measure for cutting off the part of the location data timeline used in the analysis. Looking over an entire timeline would include more than just the “scientific” career in some cases; those just earning their degrees may have a longer history on Google than their scientific career. It would be good to think through the possible analyses to see what data is needed. While the ORCID is being collected, an ORCID record is not required to include the education or other information relevant to this threshold for where to stop the analysis of the location history, and it would be easier to identify this threshold up front than to have to collect it in the future. This is about how you are scoping the location analysis so that it does not include irrelevant times, or, if you’re planning a full timeline analysis, about being transparent as to why the entire timeline is being used.
Additional information that might be helpful in looking at travel could come from asking “What conferences and workshops do you attend?”, so the answers can be compared to the locations traveled, which may allow predicting travel and identifying possible connections. Some of this can be inferred from publication, presentation, and other history, but explicit statements might capture events not reflected in publications. This would also allow identifying the geographic presence of conferences and potential future locations that may be visited, as well as exposing bias in the responses toward certain researcher communities.
What I was trying to get at with the prompt was that question 1, “What is the potential of the untapped connectedness and the connectivity of traveling researchers in the world?”, is overly broad, and it is hard to say whether it has been answered, since we don’t know if we have reached the potential (maybe the question just needs clearer wording). Did you mean “What is the potential of the untapped connectedness of traveling researchers in the world?” Even with this revision, this is more of an exploratory statement, which may be stated as “What opportunities exist for new connectedness and knowledge sharing in the global traveling behavior of researchers?”, in which you’re trying to identify the opportunities based on the connectedness instead of measuring the potential.
Comments based on this new survey revision:
How can we benefit from our travels? (analysis of depersonalised information about our travels and meta-information of who is traveling can provide more information about how we can use it for social good)
Who does “we” refer to in the question above? I would reword it to state who “we” is, or restate the question. I think “we” refers to scientists, but is this the case? I may be overinterpreting.
Authorization for adding data.
This project plans to add data to your Open Humans account, described below.
Not everyone will have an ORCID, so you may want to state “if provided” for the ORCID in the returned data. For those who are Openhumans members but don’t want to disclose their identifier, I wonder if Openhumans has a way to provide different datasets. One with the ID and one without and tag
Example of the dataset collected: travel cityA → cityB of a scientist (metadata: ORCID of a scientist who travels).
Errors with the Google Form:
The question wording needs adjustment and the selection type needs to be fixed: “What are countries you visited for CONFERENCES this year?” -> “Which countries have you visited for CONFERENCES this year?”
This question currently uses a radio-button selection, which is limited to one answer, but the way the question is stated suggests it should use multi-select checkboxes.
The question wording needs clarification and the selection type needs to be fixed: “What are countries you have been with PRIVATE VISITS this year?” -> “Which countries have you visited for PRIVATE VISITS in the past year?” Also, I’m not sure what private visits are. Do you mean personal travel, non-conference travel, or invited talks, or do you mean something else?
Given that both of the above questions ask about the past year, are you using the submission timestamp to estimate the past year on a rolling basis for the analysis, are you interested in a common time period across all respondents, or is the intention to collect visits in 2018? Did you mean “in the past 12 months”?
The question wording needs clarification so that the desired format is clearer and the responses provide the data you’re looking for. It reads as double-barreled: it asks which countries someone works with, which would seek a comprehensive list of countries with working ties, while the parenthesis that follows suggests that only countries not yet traveled to should be listed. Is the intention of this question to collect all possible travel destinations, or to gather those not yet visited that are potential destinations? This is referencing: “What are the countries are you working with? (but maybe have not visited yet)” -> I think you’re trying to ask "Which countries are you working with
Not sure if this clarifies what you’re looking for. What I think you’re asking is which countries someone has a working relationship with, regardless of travel, which I would state as: “Which countries do you have a working relationship with regardless of having traveled there?” This still leaves “working relationship” open to interpretation. Is this a working relationship with the community, a study site, a co-author/institution, or something else? Also, in asking this question, do you want to know which countries they worked with in the past year (or another timeframe), including relationships that may have ended before taking the survey, or just current ones? It might also help to specify a delimiter (e.g. separate countries with a comma).
You may want to use validation for the ORCID response and at the very least add a suggested format in the question description (e.g. https://orcid.org/0000-0003-0668-0089 or 0000-0003-0668-0089).
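To illustrate the kind of validation meant here, a minimal Python sketch follows. It is not the project's actual form validation: the helper name `normalize_orcid` is hypothetical, the pattern only accepts the two formats shown above, and the check digit is verified with the ISO 7064 MOD 11-2 scheme that ORCID iDs use.

```python
import re

# Accepts a bare iD or the full https://orcid.org/ URL form.
ORCID_PATTERN = re.compile(
    r"^(?:https?://orcid\.org/)?(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$"
)


def normalize_orcid(raw):
    """Return the bare 0000-0000-0000-000X form, or None if invalid.

    Hypothetical helper for illustration; not part of any ORCID library.
    """
    match = ORCID_PATTERN.match(raw.strip())
    if match is None:
        return None
    orcid = match.group(1)
    digits = orcid.replace("-", "")
    # ISO 7064 MOD 11-2 checksum over the first 15 digits.
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return orcid if digits[15] == expected else None
```

A form field would typically only apply the pattern part as response validation; the checksum step is something that could run later, when the responses are cleaned for analysis.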
The travel questions are not required, so it may be unclear whether people skipped the second question on purpose or missed it by scrolling too fast. You may want to split the lists of country questions onto new pages, or make them required and add the option “I have not traveled to any of the listed countries” if you think your list is comprehensive enough.
If this hasn’t been pilot tested, it may be worth checking whether there are issues with any of the survey questions. Or have you already tested these to see if they get the responses you’re looking for?
Hope these help. I’ll keep an eye out for a response in case a timely reply is needed. (I’m still in favor of approval, but hoping to help the project collect the most useful data for analysis.)
Hi, @wolfgang8741
Thanks for the comments. I will reply to each of them separately, since this is easier to read.
So, for now we assumed that the survey is taken by people who are actually involved in scientific work.
We do not need a more precise date of “when a person became a scientist”, since this sounds a bit strange, right :)? What we do care about is the trajectory of a person who is involved in scientific knowledge production. Why?
First, because a person can go through a master’s, then a PhD, then a postdoc, etc., and all of this qualifies as being involved in the knowledge dissemination process.
Second, the initial idea was to explore the trajectories of such people (and not just scientists), since this could give more ideas about how the knowledge exchange process could work in general.
So, to summarize the answer to your question: we do not need information about the trajectory of a scientist specifically, but we would like to understand the trajectory of a person who is connected to the flow of scientific knowledge around the world. I also tried to make this clear in the project description. Thank you!
Thanks for the comment.
As we already discussed with @gedankenstuecke and @veg, some such studies analyzing publications have been done.
In fact, we already cited them in the project and in our publication about multilayer network analysis for the LeWiBo project [1,2].
So, what we could analyze here is complementary to what we can get from analyzing publications in conference proceedings.
I like your idea, @wolfgang8741, of asking which conferences a person attended. I should also include it in the Google Form as an optional question. This can be additional valuable information. Thanks!
Great, I just changed the form by adding additional options, and I am also adding a regular expression so that the ORCID is collected properly. Thanks a lot for the comment!
Yes, exactly. This question is about long-term scientific connections, which can lead to future trips, since exchange trips are often based on them. (This is a complementary, additional question.)
I corrected the question in the google form. Thank you!
Coming back to this comment: I also added a question about the date (the year in this case, since it may be hard to ask for a specific month) when people started doing research.
Hi @liubovv, we may have a miscommunication here. I was not suggesting that we need to know the trajectory of a person; I used “when did you become a scientist” as an example of a possible threshold. My interest was in identifying the lower bound of the analysis and how that point in time would be established from the information collected. I believe you addressed this threshold by adding the year, as you stated in the post below.
My main concern was that Google Timelines go back to at least 2009 (at least mine does), which for some could include high school or other times that I would not consider part of my scientific knowledge production/dissemination and that would not be in scope for this project, based on what I’ve read.
I agree, having the year in which you started in academia in some way (maybe the year you started your PhD, as most undergrad/master’s students don’t travel a lot for conferences etc.?) would be the best way to limit the time frame of the data to the years of relevance to this study.
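As a rough illustration of such a lower bound, the self-reported start year could be applied as a simple filter before any analysis. This is only a sketch under stated assumptions: the record layout (a `timestamp` and a `place` per visit) and the function name `visits_in_scope` are hypothetical and do not reflect the actual Google Location History export format or the project's pipeline.

```python
from datetime import datetime, timezone


def visits_in_scope(visits, research_start_year):
    """Keep only visits from the self-reported research start year onward.

    `visits` is a list of dicts with a timezone-aware `timestamp`
    (hypothetical layout, for illustration only).
    """
    cutoff = datetime(research_start_year, 1, 1, tzinfo=timezone.utc)
    return [v for v in visits if v["timestamp"] >= cutoff]


visits = [
    {"timestamp": datetime(2009, 6, 1, tzinfo=timezone.utc), "place": "cityA"},
    {"timestamp": datetime(2016, 9, 12, tzinfo=timezone.utc), "place": "cityB"},
]
in_scope = visits_in_scope(visits, research_start_year=2014)
print([v["place"] for v in in_scope])  # prints ['cityB']
```

Filtering at the year level matches the granularity of the survey question; a finer cutoff would need a more precise start date than the form asks for.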
Hey, everyone,
thanks again to you all for your feedback and for the ideas!
We recently discussed the project with @wolfgang8741 and @gedankenstuecke.
Based on that, I made the following final changes to the project to make it clearer for people.
I removed the link to the Google Form from the first page of the project description (https://www.openhumans.org/activity/mobility-data-of-researchers/),
since we put the link to the Google Form on the detailed project description page, in the Post-sharing URL field.
On the project description page I also simplified the research question from
“What is the potential of the untapped connectedness and the connectivity of traveling researchers in the world?”
to
“How can the study of global researcher mobility data lead to actionable positive impact (for example, organisation of outreach to remote communities or contribution to problem solving for local communities around the globe)?”
I also included additional information in the project description about the complementary analysis of ORCID data. With the ORCID one can also analyze information about publications (and hence about affiliations with the universities where the researcher worked in the past).
This can help to analyze the knowledge dissemination process over a wider time range.
At the same time, since not every participant in the project has an ORCID,
I made the ORCID field in the Google Form optional.
In the Google Form I changed the question about “the date of PhD start” to:
Which year you started your Phd? (which year you consider to start your Phd or equivalent research activity)
I added an additional section in the Google Form so that it is clearer how to answer the questions
and easier to see them as separate questions.
I also explained the pronoun “We (as researchers and science community)” in the description of the project.
I also want to thank Jonathan a lot; he spent a significant amount of time revising the project to make it clearer.
Thank you again for the discussions.
Hopefully this answers all the questions we discussed over these weeks.
Agreed. I think a “meta” point is that it would be nice to capture all these thoughts in better ways – regarding how location data, in particular, might be processed and shared so that people have more options for sharing. In many ways it’s a broad design/architecture/ecosystem question: we’d like the ecosystem (really broadly speaking) to understand and support the granularity people would like. (It’s hard! Work to implement and maintain, potential liability in that, and also hard to fully anticipate desired uses – but there’s definitely a need/desire!)
I’m biased as I’ve been involved in setting up the project etc. But I think we already had a consensus for approval prior to those changes, as those changes were largely related to making the data that’s being collected more useful and not to address systemic issues or problems with the project itself?
Based on this I’d say we have reached consensus and can approve?