ConviviaR Tools: RIRO - Russian Index of the Research Organizations

Aleksei Lutai; Ivan Sterligov

RIRO is a public, open, and autonomous project designed to link the numerous identifiers of Russian organizations to existing legal entities.

Why do we need RIRO?

The databases mentioned above have assigned identifiers to the organizations and put some effort into the curation process, but none of them is reliable when it comes to:

As a result, the identifiers provided by the international services suffer from patchy coverage and have limited value for research assessment and scientometric analysis.

Selection Criteria for Organizations

As RIRO is about research, any organization that appears in the affiliation texts of scientific publications is a good candidate for RIRO. Still, some organizations have been prioritized for v.1.0:

We have not yet reviewed the entities associated with the Russian state corporations (RosAtom, RosSpace, RosTech) or the military ones.

RIRO dataset

The RIRO dataset v.1.0 is a set of CSV tables that share a primary key named code, so one can easily join the tables or build a database.
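As a rough illustration of "join the tables or build a database", here is a minimal Python sketch that loads two small CSV excerpts into an in-memory SQLite database and joins them on code. The table names official and geo, the helper load_table, and the reduced column sets are assumptions made for this example; the values are copied from the excerpts in this article, and the real files contain more columns.

```python
import csv
import io
import sqlite3

# Tiny stand-ins for the Table 1 and Table 2 CSV files (assumed layout,
# reduced to a few columns; values copied from this article).
TABLE1_CSV = """code,level,ogrn,inn,kpp
n4p9t7,Head,1021100511332,1101481574,110101001
w6n9h2,Head,1027000861568,7019011979,701701001
i7x2s7,Branch,1027000861568,7019011979,720343001
"""

TABLE2_CSV = """code,geoname_id,lat,lon,time
n4p9t7,485239,61.6660239,50.8235666,UTC+3
w6n9h2,1489425,56.490009,84.9458326,UTC+7
i7x2s7,1488754,57.134939,65.5663861,UTC+5
"""


def load_table(conn, name, csv_text):
    """Create table `name` from CSV text, storing every value as TEXT."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    column_sql = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{name}" ({column_sql})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{name}" VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )


conn = sqlite3.connect(":memory:")
load_table(conn, "official", TABLE1_CSV)  # hypothetical table name
load_table(conn, "geo", TABLE2_CSV)       # hypothetical table name

# Join the two tables on the shared primary key `code`.
joined = conn.execute(
    """
    SELECT o.code, o.level, o.kpp, g.lat, g.lon, g."time"
    FROM official AS o
    JOIN geo AS g ON g.code = o.code
    ORDER BY o.code
    """
).fetchall()

for row in joined:
    print(row)
```

Everything is stored as text on purpose: OGRN, INN, and KPP are identifiers rather than quantities, so keeping them as strings avoids accidental numeric conversion.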

Below I demonstrate how to download the dataset using the zen4R package. This requires the following steps:

I selected these organizations because they went through a chain of reorganizations in the last 10 years.

Table 1 - Official Info

Table 1 comprises the basic organization details - OGRN (Primary State Registration Number), INN (Taxpayer Identification Number), KPP (Tax Registration Reason Code), full & short names, status {active, liquidated, in reorganization process}, and the branch type {head or branch}.

The table below lists the legal entities found by OGRN and their branches (a branch organization and its head entity share the same OGRN and INN). To save some space, the column with the short names is not included.

code	level	status	name_full	ogrn	inn	kpp
n4p9t7	Head	Действующая организация	ФЕДЕРАЛЬНОЕ ГОСУДАРСТВЕННОЕ БЮДЖЕТНОЕ УЧРЕЖДЕНИЕ НАУКИ ФЕДЕРАЛЬНЫЙ ИССЛЕДОВАТЕЛЬСКИЙ ЦЕНТР КОМИ НАУЧНЫЙ ЦЕНТР УРАЛЬСКОГО ОТДЕЛЕНИЯ РОССИЙСКОЙ АКАДЕМИИ НАУК	1021100511332	1101481574	110101001
w6n9h2	Head	Действующая организация	ФЕДЕРАЛЬНОЕ ГОСУДАРСТВЕННОЕ БЮДЖЕТНОЕ НАУЧНОЕ УЧРЕЖДЕНИЕ ТОМСКИЙ НАЦИОНАЛЬНЫЙ ИССЛЕДОВАТЕЛЬСКИЙ МЕДИЦИНСКИЙ ЦЕНТР РОССИЙСКОЙ АКАДЕМИИ НАУК	1027000861568	7019011979	701701001
i7x2s7	Branch	Действующая организация	ТЮМЕНСКИЙ КАРДИОЛОГИЧЕСКИЙ НАУЧНЫЙ ЦЕНТР - ФИЛИАЛ ФЕДЕРАЛЬНОГО ГОСУДАРСТВЕННОГО БЮДЖЕТНОГО НАУЧНОГО УЧРЕЖДЕНИЯ ТОМСКИЙ НАЦИОНАЛЬНЫЙ ИССЛЕДОВАТЕЛЬСКИЙ МЕДИЦИНСКИЙ ЦЕНТР РОССИЙСКОЙ АКАДЕМИИ НАУК	1027000861568	7019011979	720343001
y5b3t9	Head	Действующая организация	ФЕДЕРАЛЬНОЕ ГОСУДАРСТВЕННОЕ БЮДЖЕТНОЕ ОБРАЗОВАТЕЛЬНОЕ УЧРЕЖДЕНИЕ ВЫСШЕГО ОБРАЗОВАНИЯ МИРЭА - РОССИЙСКИЙ ТЕХНОЛОГИЧЕСКИЙ УНИВЕРСИТЕТ	1037739552740	7729040491	772901001
s8t3r2	Branch	Действующая организация	ФИЛИАЛ ФЕДЕРАЛЬНОГО ГОСУДАРСТВЕННОГО БЮДЖЕТНОГО ОБРАЗОВАТЕЛЬНОГО УЧРЕЖДЕНИЯ ВЫСШЕГО ОБРАЗОВАНИЯ МИРЭА - РОССИЙСКИЙ ТЕХНОЛОГИЧЕСКИЙ УНИВЕРСИТЕТ В Г. СТАВРОПОЛЕ	1037739552740	7729040491	263543001

Showing 1 to 5 of 6 entries

Each row has a unique “code”, which serves as the primary key for all the RIRO tables.
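Because a branch shares its OGRN and INN with its head entity, the rows of Table 1 can be grouped into legal entities by OGRN. The following Python sketch shows one way to do that, using the rows from the excerpt above; the variable names and the output layout are assumptions for illustration only.

```python
from collections import defaultdict

# (code, level, ogrn, inn, kpp) rows copied from the Table 1 excerpt.
rows = [
    ("n4p9t7", "Head", "1021100511332", "1101481574", "110101001"),
    ("w6n9h2", "Head", "1027000861568", "7019011979", "701701001"),
    ("i7x2s7", "Branch", "1027000861568", "7019011979", "720343001"),
    ("y5b3t9", "Head", "1037739552740", "7729040491", "772901001"),
    ("s8t3r2", "Branch", "1037739552740", "7729040491", "263543001"),
]

# One record per legal entity (OGRN): the head code plus its branch codes.
entities = defaultdict(lambda: {"head": None, "branches": []})
for code, level, ogrn, inn, kpp in rows:
    if level == "Head":
        entities[ogrn]["head"] = code
    else:
        entities[ogrn]["branches"].append(code)

for ogrn, entity in entities.items():
    print(ogrn, entity["head"], entity["branches"])
```

In this excerpt the head and its branch differ in KPP and in code, while OGRN and INN are identical.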

Table 2 - Locations and Geodata

Table 2 comprises the full address and its separate parts (in Russian), accompanied by a geocode (geoname_id), geographic coordinates, and the time zone. Table 2 is the only table in RIRO that corresponds 1:1 to Table 1 via “code”. The other tables can have several rows per code.

code	geoname_id	lat	lon	time	federal_district	address_full	postal_code	region_type	region	city	street_type	street	house_type	house
n4p9t7	485239	61.6660239	50.8235666	UTC+3	Северо-Западный	167982, РЕСПУБЛИКА КОМИ, ГОРОД СЫКТЫВКАР, УЛИЦА КОММУНИСТИЧЕСКАЯ, ДОМ 24	167000	республика	Коми	Сыктывкар	улица	Коммунистическая	дом	24
w6n9h2	1489425	56.490009	84.9458326	UTC+7	Сибирский	634009, ОБЛАСТЬ ТОМСКАЯ, ГОРОД ТОМСК, ПЕРЕУЛОК КООПЕРАТИВНЫЙ, 5	634009	область	Томская	Томск	переулок	Кооперативный	дом	5
i7x2s7	1488754	57.134939	65.5663861	UTC+5	Уральский	625026, ОБЛАСТЬ ТЮМЕНСКАЯ, ГОРОД ТЮМЕНЬ, УЛИЦА МЕЛЬНИКАЙТЕ, ДОМ 111	625026	область	Тюменская	Тюмень	улица	Мельникайте	дом	111
y5b3t9	524901	55.6733309	37.4801249	UTC+3	Центральный	119454, ГОРОД МОСКВА, ПРОСПЕКТ ВЕРНАДСКОГО, 78	119454	город	Москва	Москва	проспект	Вернадского	дом	78
s8t3r2	487846	45.0521865	41.9126233	UTC+3	Северо-Кавказский	355035, КРАЙ СТАВРОПОЛЬСКИЙ, ГОРОД СТАВРОПОЛЬ, ПРОСПЕКТ КУЛАКОВА, ДОМ 8, КВ-Л 601	355035	край	Ставропольский	Ставрополь	проспект	Кулакова	дом	8

Showing 1 to 5 of 6 entries

Table 3 - Hierarchy

This is a very important table, as it links each parent organization not only to its existing branches but also to its predecessors (for convenience, both are referred to in this document as “children accounts”).

The “child_code” is the code of the child account; the values in the “relation” column reflect the nature of its subordination (a branch or a predecessor).

Showing 1 to 5 of 39 entries

Thus, using a list of OGRNs for the 3 selected organizations, we extracted 6 entities with unique codes (head and branch organizations) from Table 1 and then used them to retrieve a list of all their predecessors. As a result, we have built a list of 42 entities with unique code values.

We will use this hierarchy to gather the matched identifiers from other RIRO tables (see below).
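The gathering step described above can be sketched in Python as a breadth-first walk over the hierarchy. This is only an illustration under assumptions: the parent column is assumed to be named code, the relation labels "branch" and "predecessor" are assumed spellings, the rows are a hand-made excerpt, and the code x0x0x0 is invented to show a predecessor of a predecessor.

```python
from collections import deque

# Hypothetical Table 3 excerpt: (parent code, child_code, relation).
hierarchy = [
    ("w6n9h2", "i7x2s7", "branch"),
    ("n4p9t7", "t5p4t4", "predecessor"),
    ("n4p9t7", "t6y7n3", "predecessor"),
    ("t6y7n3", "x0x0x0", "predecessor"),  # invented code, for illustration
]


def collect_family(start_codes, hierarchy):
    """Return {code: relation} for the start codes and all their descendants."""
    children = {}
    for parent, child, relation in hierarchy:
        children.setdefault(parent, []).append((child, relation))

    found = {code: "selected" for code in start_codes}
    queue = deque(start_codes)
    while queue:
        parent = queue.popleft()
        for child, relation in children.get(parent, []):
            if child not in found:  # guards against repeated or cyclic rows
                found[child] = relation
                queue.append(child)
    return found


family = collect_family(["n4p9t7", "w6n9h2"], hierarchy)
print(family)
```

The resulting set of codes can then be used to filter any other RIRO table by its code column.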

Table 4 - ROR

Research Organization Registry (ROR) is an international project launched in 2019 with the ambitious goal of creating a public ORCID-like registry for research organizations. It inherits a lot from GRID (Global Research Identifier Database) and (I guess) from Wikidata. The ROR organization info can be downloaded as a JSON dump or retrieved via the API.

Table 4 does not contain all the Russian records from ROR, only those that we matched to the organizations present in RIRO.

code	ror_id	grid	ror_name	ror_name_rus	ror_relationships	ror_city	ror_links	ror_aliases	ror_acronyms	ror_status
t5p4t4	00029be75	grid.483432.a	Institute of Biology of Komi Scientific Centre	Института биологии Коми НЦ УрО РАН	label:Russian Academy of Sciences|type:Parent|id:https://ror.org/05qrfxd25	Syktyvkar	http://ib.komisc.ru/en/	nstitute of Biology of Komi Scientific Centre of the Ural Branch of the Russian Academy of Sciences	IB Komi SC UB RAS	active
t6y7n3	04ggfw624	grid.473258.f	Institute of Chemistry, Komi Science Center	Коми научный центр УрО РАН	label:Department of Chemistry and Material Sciences|type:Parent|id:https://ror.org/059tqvg48	Syktyvkar	http://chemi.komisc.ru/en/	Komi SC		active
r2r5g5	02k9pct19	grid.494805.4	Institute of Physiology, Komi Science Center	Институт экологической физиологии Уральского отделения	label:Ural Branch of the Russian Academy of Sciences|type:Parent|id:https://ror.org/02s4h3z39	Syktyvkar	http://physiol.komisc.ru			active
h4p9g9	01tpj7340	grid.501775.1	Institute of Geology, Komi Science Centre	Институт геологии Коми НЦ УрО РАН	label:Ural Branch of the Russian Academy of Sciences|type:Parent|id:https://ror.org/02s4h3z39	Syktyvkar	http://geo.komisc.ru/en/	Institute of Geology, Komi Science Centre, Russian Mineralogical Society		active
w6n9h2	01z0w8p93	grid.473330.0	Tomsk National Research Medical Center	Томский национальный исследовательский медицинский центр	label:Russian Academy of Sciences|type:Parent|id:https://ror.org/05qrfxd25	Tomsk	http://www.tnimc.ru/en/		TNRMC	active

Showing 1 to 5 of 12 entries

The “ror_relationships” column holds composite values of the following structure:

label:<name>|type:<relationship type>|id:<ROR URL>

with 3 units (label, type, id) for each related (according to ROR) organization.
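A composite value of this kind can be split back into its three units with a few lines of Python. The sketch below is an assumption-laden illustration: the function name parse_relationship is made up, and it handles a single relationship per value, since the separator between several relationships in one cell is not shown in this article.

```python
def parse_relationship(value):
    """Split one `label:...|type:...|id:...` value into a dict."""
    parsed = {}
    # Tolerate backslash-escaped pipes, which some exports produce.
    for unit in value.replace("\\|", "|").split("|"):
        # Split on the first colon only, so the URL in `id` keeps its own colon.
        key, _, content = unit.partition(":")
        parsed[key.strip()] = content.strip()
    return parsed


example = "label:Russian Academy of Sciences|type:Parent|id:https://ror.org/05qrfxd25"
relationship = parse_relationship(example)
print(relationship)
```

Note that a label containing a pipe character would break this simple split; a real parser would need to know how such cases are encoded.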

But in cases like our example such references can be misleading! According to ROR, the research institutes of the Komi Federal Research Center have different parents: the Russian Academy of Sciences (Institute of Biology), the Department of Chemistry and Material Sciences (Institute of Chemistry), and the Ural Branch of the Russian Academy of Sciences (Institutes of Physiology and Geology).

So for a single organization ROR shows 4 accounts subordinate to three different RAS structures.

In fact, the 4 research institutes of the Komi Federal Research Center found here ceased to exist as legal entities 3 years ago: they were merged into one federal research center in May 2018.

Using Table 3, one can gather the related identifiers and qualify each of them as corresponding to a branch or a predecessor.
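A minimal Python sketch of that qualification step follows. The hierarchy excerpt is hypothetical (the relation labels are assumed spellings and the parent column is assumed to be named code); the ROR IDs are copied from the Table 4 excerpt.

```python
# Hypothetical Table 3 excerpt: child_code -> (parent code, relation).
hierarchy = {
    "t5p4t4": ("n4p9t7", "predecessor"),
    "t6y7n3": ("n4p9t7", "predecessor"),
    "i7x2s7": ("w6n9h2", "branch"),
}

# (code, ror_id) pairs copied from the Table 4 excerpt.
ror_rows = [
    ("t5p4t4", "00029be75"),
    ("t6y7n3", "04ggfw624"),
    ("w6n9h2", "01z0w8p93"),
]


def qualify(ror_rows, hierarchy):
    """Attach the current legal entity and the relation to every ROR ID."""
    qualified = []
    for code, ror_id in ror_rows:
        # Codes absent from the hierarchy are treated as the entity itself.
        parent, relation = hierarchy.get(code, (code, "self"))
        qualified.append(
            {"ror_id": ror_id, "code": code,
             "current_entity": parent, "relation": relation}
        )
    return qualified


qualified = qualify(ror_rows, hierarchy)
for item in qualified:
    print(item)
```

With such a mapping, the two institute-level ROR IDs are attributed to the single existing entity n4p9t7 and flagged as belonging to predecessors.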

Table 5 - Wikidata

Wikidata is a public repository of structured data originating from multiple sources. Some sources are more or less consistent (like CrossRef or ISSN), but many Wikidata records are created and modified by people. As a result, even though Wikidata offers pre-defined templates for universities and research organizations, Wikidata profiles have a lot of unpopulated fields.

Table 5 comprises a list of fields found in the Wikidata profiles of the organizations, but it is not a full copy. Moreover, Table 5 includes only those Russian research organizations that match the organizations in RIRO.

code	wikidata	wd_itemlabel	wd_rus_orgname	wd_eng_orgname	wd_web_site	wikipedia_eng	wikipedia_rus	wd_itemaltlabel	wd_altlabel	wd_oldnames	wd_isni	wd_elibrary	wd_grid	wd_ror	wd_google	wd_mag	wd_qs	wd_the	wd_ria	wd_umultirank	wd_arwu	wd_cr_funder	wd_doi_pref	wd_twitter	wd_facebook	wd_twitter_id	wd_youtube

code	wikidata	wd_itemlabel	wd_rus_orgname	wd_eng_orgname	wd_web_site	wd_itemaltlabel	wd_altlabel	wd_isni	wd_grid	wd_ror
n4p9t7	Q4229557	Institute of Chemistry, Komi Science Center	Коми научный центр УрО РАН	Institute of Chemistry, Komi Science Center	chemi.komisc.ru	Komi SC...	Коми научный центр РАН...	0000 0000 9097 8504	grid.473258.f	04ggfw624
n4p9t7	Q61931541	Institute of Geology Komi SC UB RAS	Институт геологии Коми НЦ УрО РАН	Institute of Geology Komi SC UB RAS	geo.komisc.ru	IG KSC UB RAS, Institute of Geology KSC ...	Институт геологии Коми научного центра У...		grid.501775.1	01tpj7340
t5p4t4	Q33121401	Institute of Biology of Komi Scientific Centre	Института биологии Коми НЦ УрО РАН	Institute of Biology of Komi Scientific Centre	ib.komisc.ru	IB Komi SC UB RAS, Institute of Biology ...			grid.483432.a	00029be75
b6h7x5	Q30263248	Research Institute of Pharmacology and Regenerative Medicine named ED Goldberg	Научно-исследовательский институт фармакологии и регенеративной медицины имени Е.Д. Гольдберга	Research Institute of Pharmacology and Regenerative Medicine named ED Goldberg	pharmso.ru	Federal State Scientific Institution "Re...		0000 0004 4657 0877	grid.465396.b	44171329
h2v5i8	Q30263200	Research Institute of Medical Genetics of Russian Academy of Medical Sciences	Научно-исследовательский институт медицинской генетики	Research Institute of Medical Genetics of Russian Academy of Medical Sciences	medgenetics.ru	Federal State Scientific Institution "Re...		0000 0004 0620 3511	grid.465310.5	029k7k053

Showing 1 to 5 of 9 entries

Table 6 - Scopus

Scopus is one of the leading citation indexes, accumulating metadata from 20k+ journal titles, selected conference sources, and some academic book titles. Table 6 lists the Scopus affiliation profiles matched to the organizations in RIRO, along with the number of publications under each Scopus affiliation profile (as of the date of the request, April 2021).

Scopus Affiliation IDs can be used in search queries via the online UI or the API service. The latter has a few wrappers for Python and R that make working with the API more comfortable.

Please note that matching the Scopus affiliation profiles to RIRO organizations is based on the affiliation name and location. It does not guarantee that all the publications in a profile are assigned to it correctly. More details on how to edit the affiliation profiles in Scopus can be found on the Elsevier website.
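As a small illustration of using the affiliation IDs in a search, the Python sketch below builds a Scopus advanced-search string from a list of IDs with the AF-ID field code. The function name is made up for this example, and the sketch only composes the query text; it does not call the Scopus API.

```python
def affiliation_query(scopus_ids):
    """Build a Scopus advanced-search query matching any of the given IDs."""
    if not scopus_ids:
        raise ValueError("at least one Scopus affiliation ID is required")
    return " OR ".join(f"AF-ID({scopus_id})" for scopus_id in scopus_ids)


# IDs copied from the Table 6 excerpt: the Komi Science Centre profile and
# the profile of its Institute of Biology.
query = affiliation_query(["60004454", "60105357"])
print(query)  # AF-ID(60004454) OR AF-ID(60105357)
```

Combining the IDs of a parent organization with those of its branches and predecessors in one query is one way to approximate the full publication output of the existing legal entity.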

code	scopus_id	scopus_affil_name	scopus_affil_name_variants	scopus_affil_city	scopus_affil_country	scopus_affil_pubs
n4p9t7	60004454	Komi Science Centre Ural Division, Russian Academy of Sciences	Russian Academy Of Sciences	Syktyvkar	Russian Federation	2425
t5p4t4	60105357	Institute of Biology of the Komi Science Centre of the Ural Branch of the Russian Academy of Sciences	Russian Academy Of Sciences	Syktyvkar	Russian Federation	1204
x4x3v2	60110362	Institute of Socio- Economic & Energy Problems of North of the Komi Science Centre of the Ural Branch of the Russian Academy of Sciences	Institute Of Socio-economic And Energy Problems Of The North	Syktyvkar	Russian Federation	45
t6y7n3	60110359	Institute of Chemistry of the Komi Science Centre of the Ural Branch of the Russian Academy of Sciences	Institute Of Chemistry	Syktyvkar	Russian Federation	656
r2r5g5	60110361	Institute of Physiology of the Komi Scientific Center of the Ural Branch of the Russian Academy of Sciences	Institute Of Physiology	Syktyvkar	Russian Federation	434

Showing 1 to 5 of 20 entries

Table 7 - Microsoft Academic

Microsoft Academic Graph (MAG) is a database built from information that Bing parsers extract from publisher websites and PDF files. This approach differs from the one used by Web of Science and Scopus, which receive a large part of the information for indexing directly from the publishers.

Even though the latest news about MAG shocked us too, we decided to include the MAG Organization IDs in RIRO. A few international companies have committed to launching a new tool that may replace MAG:

code	mag_id	mag_orgname
y5b3t9	48336290	moscow state institute of radio engineering electronics and automation
n2b3b6	98594063	moscow state university of instrument engineering and computer science
z7p3b7	208875869	moscow state university of fine chemical technologies

Showing 1 to 3 of 3 entries

Table 8 - InCites

InCites is an analytical solution built on top of the Web of Science Core Collection. It allows exporting the records, so the matched names can be used for further analysis.

Table 8 lists the official organization names in InCites and the Web of Science Core Collection against the RIRO codes.

code	incites_orgname	wos_orgname	wos_pubs_2019_21
n4p9t7	Komi Science Centre of the Ural Branch of the Russian Academy of Sciences	Komi Science Centre Of The Ural Branch Of The Russian Academy Of Sciences	599
t5p4t4	Institute of Biology, Komi Scientific Centre, Ural Branch RAS	Institute Of Biology Komi Scientific Centre Ural Branch Ras	275
t6y7n3	Institute of Chemistry, Komi Scientific Centre, Ural Branch RAS	Institute Of Chemistry Komi Scientific Centre Ural Branch Ras	149
r2r5g5	Institute of Physiology, Komi Scientific Center of the Russian Academy of Sciences	Institute Of Physiology Komi Scientific Center Of The Russian Academy Of Sciences	70
h4p9g9	Institute of Geology, Komi Scientific Centre of the Russian Academy of Sciences	Institute Of Geology Komi Scientific Centre Of The Russian Academy Of Sciences	122

Showing 1 to 5 of 11 entries

Table 9 - SciVal

SciVal is an analytical tool built on top of Scopus. Some Russian organizations have access to the SciVal API and could use the IDs matched against the RIRO codes in Table 9.

code	scival_id	scival_institution_name
t6y7n3	719642	Institute of Chemistry
w6n9h2	719242	RAS - Tomsk National Research Medical Сепtеr
b6h7x5	719248	RAS - Goldberg Research Institute of Pharmacology and Regenerative Medicine
y6t2i3	719244	RAS - Cardiology Research lnstitute, Tomsk National Research Medical Center
y5b3t9	708852	Moscow Technological University

Showing 1 to 5 of 5 entries

Table 10 - Russian Universities Assessment System

This system, governed by the Russian Ministry of Science & Higher Education, collects various statistical reports from all Russian higher education institutions (excluding some schools under the Ministry of Defence and the like). Such reports contain a lot of useful information, from financial to enrollment data.

Table 10 lists the IDs that correspond to each university’s web page on the portal, matched to the RIRO codes.

Table 11 - Web of Science

Web of Science is by far the world’s oldest and most prominent citation index. At the moment Web of Science does not provide organization IDs that could be used for search or data retrieval, but the search results contain the organization names. Table 11 lists almost 4000 such names matched to the organizations in RIRO. This is not a complete list of the known affiliation names for Russian research organizations, but we hope to adjust this table in future releases of RIRO.

code	wos_name_variant	wos_main	wos_incites_main_name
n4p9t7	Komi Science Centre Of The Ural Branch Of The Russian Academy Of Sciences	YES	Komi Science Centre Of The Ural Branch Of The Russian Academy Of Sciences
t5p4t4	Institute Of Biology Komi Scientific Centre Ural Branch Ras	YES	Institute Of Biology Komi Scientific Centre Ural Branch Ras
t6y7n3	Institute Of Chemistry Komi Scientific Centre Ural Branch Ras	YES	Institute Of Chemistry Komi Scientific Centre Ural Branch Ras
r2r5g5	Institute Of Physiology Komi Scientific Center Of The Russian Academy Of Sciences	YES	Institute Of Physiology Komi Scientific Center Of The Russian Academy Of Sciences
h4p9g9	Inst Geol Komi Sci Ctr Ub Ras	NO	Institute Of Geology Komi Scientific Centre Of The Russian Academy Of Sciences

Showing 1 to 5 of 33 entries
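Since Web of Science offers names rather than IDs, a typical use of Table 11 is to map an organization name found in search results back to a RIRO code. The Python sketch below shows one possible lookup; the normalization rule (case folding and whitespace collapsing) is an assumption of this example, not part of RIRO, and the rows are copied from the excerpt above.

```python
# (code, wos_name_variant, wos_main) rows copied from the Table 11 excerpt.
rows = [
    ("n4p9t7", "Komi Science Centre Of The Ural Branch Of The Russian Academy Of Sciences", "YES"),
    ("t5p4t4", "Institute Of Biology Komi Scientific Centre Ural Branch Ras", "YES"),
    ("h4p9g9", "Inst Geol Komi Sci Ctr Ub Ras", "NO"),
]


def normalize(name):
    """Case-fold and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.casefold().split())


variant_to_code = {normalize(variant): code for code, variant, _ in rows}


def lookup(name):
    """Return the RIRO code for a Web of Science name variant, or None."""
    return variant_to_code.get(normalize(name))


print(lookup("INST GEOL  KOMI SCI CTR UB RAS"))
print(lookup("Some Unknown Organization"))
```

An exact-match lookup like this only finds variants that are already listed, which is why the completeness of the name list matters.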

All IDs for 3 organizations

Now that we have glanced at the identifiers found in each table for the 3 selected organizations, we are ready for a wider picture.

The illustration below shows the identifiers (ROR, GRID, Scopus Affiliation ID, InCites ID, MAG, Wikidata, 1-Monitoring) matched to the parent organizations, branches, and predecessors, with each organization in a separate section. The identifiers are placed along the X-axis (by organization). The entities (RIRO records) are placed along the Y-axis: the existing organizations are shown as squares (the parent organizations are marked with a special sign), the predecessors as circles.

This picture is not an example of clarity, I admit, so let me explain it more thoroughly. The identifiers for each organization are presented in a separate section (from left to right). Each section has its own number of horizontal rows corresponding to the parent organization, branches, and predecessors (from top to bottom). The existing organizations are marked as squares: either a parent one (also marked with a sign, on top) or the branches (under the parent). The identifiers corresponding to the predecessors (liquidated organizations) are marked as circles (with less saturated colours).

Or, in plainer language: every circle is an organization identifier that corresponds to a predecessor, not to an existing legal entity.

RIRO’s scope?

The UpSet diagram below shows the sets of organizations listed in RIRO (v.1.0) and matched to the various identifiers. Only the active (existing) head organizations are counted here, so the identifiers referring to branches or predecessors (liquidated via acquisition) are not counted.

RIRO v.1.0 lists 1774 existing parent organizations. Many of those have no external identifiers but (as the previous picture showed) may have a lot of identifiers corresponding to their branches or predecessors.
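The counts behind an UpSet diagram are the sizes of exclusive combinations: each organization is counted once, under the exact set of identifier types it has. The Python sketch below shows that counting on toy data; the codes and memberships are invented for illustration and do not come from RIRO.

```python
from collections import Counter

# Invented membership data: RIRO code -> identifier types matched to it.
matched = {
    "aaa111": {"ROR", "Scopus", "Wikidata"},
    "bbb222": {"ROR", "Scopus", "Wikidata"},
    "ccc333": {"Scopus"},
    "ddd444": set(),  # an organization with no external identifiers
}

# Count organizations per exact combination of identifier types.
combination_counts = Counter(frozenset(ids) for ids in matched.values())

for combination, count in combination_counts.most_common():
    label = " + ".join(sorted(combination)) or "no identifiers"
    print(f"{label}: {count}")
```

The empty combination corresponds to the organizations that have no external identifiers of their own.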

Additional Notes

Feedback

So far RIRO is developed by only 2 people, me and Ivan Sterligov, so we decided to start with a Google form (in Russian) with 5 pre-defined scenarios of change requests:

Table 10 sample rows (code and monitoring ID):

code	monitoring_id
s8t3r2	14013186
r2i9t5	14013178
y5b3t9	134