Corpus Title: AIDA Phase 3 Practice Topic Source Data and Annotation
LDC Catalog-ID: LDC2025T02
Authors: Jennifer Tracey, Stephanie Strassel, Jeremy Getman, Ann Bies, Kira Griffitt, David Graff, Chris Caruso

This corpus was developed by the Linguistic Data Consortium for the DARPA AIDA Program and contains a multi-media collection of 1417 documents used for the AIDA Scenario 3 practice topics, as well as annotations for 279 of those documents.

The AIDA (Active Interpretations of Disparate Alternatives) Program is designed to support development of technology that can assist in cultivating and maintaining understanding of events when there are conflicting accounts of what happened (e.g. who did what to whom and/or where and when events occurred). AIDA systems must extract entities, events, and relations from individual multimedia documents, aggregate that information across documents and languages, and produce multiple knowledge graph hypotheses that characterize the conflicting accounts that are present in the corpus (see https://www.darpa.mil/program/active-interpretation-of-disparate-alternatives for more information about the program).

Each phase of the AIDA program focused on a different scenario, or broad topic area. The scenario for Phase 3 was the ongoing COVID-19 global pandemic. In addition, each scenario had a set of specific subtopics that were designated as either "practice topics" (released for use in system development) or "evaluation topics" (reserved for use in the AIDA program evaluations for each phase).

The annotation in this release includes: exhaustive event, relation, and entity mention annotation for 64 documents; claim frame annotation (including supporting event, relation, and entity annotations) for 203 documents; and a separate set of 30 practice topic query claim frames from 30 documents (1 query per document).
Claim frame annotation was a new task developed for the final phase of AIDA and consisted of identifying claims (i.e. statements claiming that something was or was not true) relevant to the COVID-19 pandemic. The claim frame annotation in this release consists of claim annotations manually produced by: LDC; University of Colorado Boulder; Johns Hopkins University; Language Technologies Institute, Carnegie Mellon University; and University of Illinois Urbana-Champaign. Additional details about the claim frame task are provided in section 7.2 below and in the documentation for this corpus.

The Scenario 3 practice topics covered in this annotation are listed in AIDA_Phase3_Practice_Topics_Templates.xlsx in the /docs directory.

2.0 Directory Structure

The directory structure and contents of the package are summarized below -- paths shown are relative to the base (root) directory of the package:

  ./data/source/     -- contains zip files subdivided by data type (see below)
  ./data/annotation/ -- contains subdirectories of TA1 and TA3 annotation
  ./docs/            -- contains documentation about the source data and annotation
  ./tools/           -- contains software for LTF data manipulation

The "data" directory has a separate subdirectory for each of the following data types, and each directory contains one or more zip archives with data files of the given type; the list shows the archive-internal directory and file-extension strings used for the data files of each type:

  bmp/*.bmp.zip -- contains "bmp/*.bmp.ldcc" files (image data)
  gif/*.gif.zip -- contains "gif/*.gif.ldcc" files (image data)
  jpg/*.jpg.zip -- contains "jpg/*.jpg.ldcc" files (image data)
  mp4/*.mp4.zip -- contains "mp4/*.mp4.ldcc" files (typically video)
  png/*.png.zip -- contains "png/*.png.ldcc" files (image data)
  svg/*.svg.zip -- contains "svg/*.svg.ldcc" files (image data)

  ltf/*.ltf.zip -- contains "ltf/*.ltf.xml" (segmented/tokenized text data)
  psm/*.psm.zip -- contains "psm/*.psm.xml" (stand-off mark-up for ltf.xml)

Data types in the first group (the *.ldcc files) consist of original source materials presented in "ldcc wrapper" file format (see section 4.2 below). The latter group (ltf and psm) are created by LDC from source HTML data, by way of an intermediate XML reduction of the original HTML content for "root" web pages (see section 4.1 for a description of the process, and section 5 for details on the LTF and PSM file formats).

The 6-character file-ID of the zip archive matches the first 6 characters of the 9-character file-IDs of the data files it contains. For example:

  zip archive file ./data/png/KC003A.png.zip contains:

    png/KC003A5YS.png.ldcc
    png/KC003A7YK.png.ldcc
    ...
    png/KC003A7YG.png.ldcc
    png/KC003A7T1.png.ldcc

(The "ldcc" file format is explained in more detail in section 4.2 below.)

Note that the number of data files per zip archive varies quasi-randomly, and could range to over 40,000 files per archive. (In the present release, the largest single zip archive has 3390 files.)

3.0 Content Summary

3.1 Source Data

"#RtPgs" refers to the number of root HTML pages that were scouted and harvested; the other columns indicate the total number of unique data files of the various types extracted from those root pages.

  #RtPgs  #Text  #Imgs
    1417   1139  10308

3.2 Annotation Data

./data/annotation/TA1/ contains exhaustive event, relation, and entity annotation for a total of 64 unique documents, with the following distribution across language:

  Language  Docs
  ENG         24
  SPA         21
  RUS         19

./data/annotation/TA3/ contains claim frame annotations from five teams of annotators, with the following counts of docs and claims annotated by each team:
  Team      Docs  Claims
  LDC         58     637
  Colorado   150    1397
  CMU         26     178
  JHU        148    1890
  UIUC        16      81

The TA3 directory also contains a set of 30 claim frame queries produced by LDC, one from each of 30 documents, with the following distribution across language:

  Language  Docs
  ENG         18
  SPA          6
  RUS          6

Claim frame queries were example claim frames intended to be used by systems as "queries" to extract similar claims from additional documents.

4.0 Data Processing and Character Normalization

Most of the content has been harvested from various web sources using an automated system that is driven by manual scouting for relevant material. Some content may have been harvested manually, or by means of ad-hoc scripted methods for sources with unusual attributes.

4.1 Treatment of original HTML text content

All harvested HTML content was initially converted from its original form into a relatively uniform XML format; this stage of conversion eliminated irrelevant content (menus, ads, headers, footers, etc.), and placed the content of interest into a simplified, consistent markup structure.

The "homogenized" XML format then served as input for the creation of a reference "raw source data" (rsd) plain text form of the web page content; at this stage, the text was also conditioned to normalize white-space characters, and to apply transliteration and/or other character normalization, as appropriate to the given language.

This processing creates the ltf.xml and psm.xml files for each harvested "root" web page; these file formats are described in more detail in section 5 below.

4.2 Treatment of non-HTML data types: "ldcc" file format

To the fullest extent possible, all discrete resources referenced by a given "root" HTML page (style sheets, javascript, images, media files, etc.) are stored as separate files of the given data type, and assigned separate 9-character file-IDs (the same form of ID as is used for the "root" HTML page).
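Each wrapped file starts with a fixed-size header block (a short ASCII signature stating the header size, YAML metadata, space padding, and a closing "endLDCc\n" marker), followed by the original data unchanged; the rest of this section gives the details. The Python sketch below is one possible way to unpack such a file and is not part of the released tools. It reads the header size from the signature rather than hard-coding 1024, and it tolerates any space padding inside the signature. The function name and the "data_type" key in the demo are illustrative assumptions.

```python
def split_ldcc(data: bytes):
    """Split a wrapped *.ldcc byte string into (yaml_text, original_bytes)."""
    signature = data[:16]
    parts = signature.split(b"\n")
    if len(parts) < 2 or parts[0].strip() != b"LDCc":
        raise ValueError("missing LDCc signature")
    # The second signature line states the byte count of the header block.
    header_size = int(parts[1].strip())
    header = data[:header_size]
    if len(header) < header_size or not header.endswith(b"endLDCc\n"):
        raise ValueError("header block is truncated or not terminated")
    # Drop the 16-byte signature and 8-byte terminator, then the space padding.
    yaml_text = header[16:-8].decode("utf-8").rstrip(" ")
    return yaml_text, data[header_size:]


# Demo on a synthetic file. The signature padding and the "data_type" key
# are chosen for this demo only; real headers may differ.
SIG = b"LDCc   \n1024   \n"
yaml_part = b"data_type: png\n"
padding = b" " * (1024 - len(SIG) - len(yaml_part) - 8)
wrapped = SIG + yaml_part + padding + b"endLDCc\n" + b"\x89PNG demo bytes"

meta, payload = split_ldcc(wrapped)
```

The returned yaml_text can then be handed to any YAML parser (for example yaml.safe_load from the third-party PyYAML package), and payload is the original file content.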
In order to present these attached resources in a stable and consistent way, the LDC has developed a "wrapper" or "container" file format, which presents the original data as-is, together with a specialized header block prepended to the data. The header block provides metadata about the file contents, including the MD5 checksum (for self-validation), the data type and byte count, url, and citations of source-ID and parent (HTML) file-ID.

The LDCC header block always begins with a 16-byte ASCII signature, as shown between double-quotes on the following line (where "\n" represents the ASCII "newline" character 0x0A):

  "LDCc \n1024 \n"

Note that the "1024" on the second line of the signature represents the exact byte count of the LDCC header block.

Immediately after the 16-byte signature, a YAML string presents a data structure comprising the file-specific header content, expressed as a set of "key: value" pairings in UTF-8 encoding. The YAML string is padded at the end with space characters, such that when the following 8-byte string is appended, the full header block size is exactly 1024 bytes (or whatever size is stated in the initial signature):

  "endLDCc\n"

In order to process the content of an LDCC header:

  - read the initial block of 1024 bytes from the *.ldcc data file
  - check that it begins with "LDCc \n1024 \n" and ends with "endLDCc\n"
  - strip off those 16- and 8-byte portions
  - pass the remainder of the block to a YAML parser.

In order to access the original content of the data file, simply skip or remove the initial 1024 bytes.

5.0 Overview of XML Data Structures

5.1 PSM.xml -- Primary Source Markup Data

The "homogenized" XML format described above preserves the minimum set of tags needed to represent the structure of the relevant text as seen by the human web-page reader.
When the text content of the XML file is extracted to create the "rsd" format (which contains no markup at all), the markup structure is preserved in a separate "primary source markup" (psm.xml) file, which enumerates the structural tags in a uniform way, and indicates, by means of character offsets into the rsd.txt file, the spans of text contained within each structural markup element.

For example, in a discussion-forum or web-log page, there would be a division of content into the discrete "posts" that make up the given thread, along with "quote" regions and paragraph breaks within each post. After the HTML has been reduced to uniform XML, and the tags and text of the latter format have been separated, information about each structural tag is kept in a psm.xml file, preserving the type of each relevant structural element, along with its essential attributes ("post_author", "date_time", etc.), and the character offsets of the text span comprising its content in the corresponding rsd.txt file.

5.2 LTF.xml -- Logical Text Format Data

The "ltf.xml" data format is derived from rsd.txt, and contains a fully segmented and tokenized version of the text content for a given web page. Segments (sentences) and the tokens (words) are marked off by XML tags (SEG and TOKEN), with "id" attributes (which are only unique within a given XML file) and character offset attributes relative to the corresponding rsd.txt file; TOKEN tags have additional attributes to describe the nature of the given word token.

The segmentation is intended to partition each text file at sentence boundaries, to the extent that these boundaries are marked explicitly by suitable punctuation in the original source data.
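The SEG and TOKEN structure described above can be read with a standard XML parser. The Python sketch below is illustrative only. The element names SEG and TOKEN and the "id" attribute come from the description above, but the offset attribute names "start_char" and "end_char", the wrapper elements, and the sample content are assumptions that should be checked against an actual ltf.xml file.

```python
import xml.etree.ElementTree as ET


def read_tokens(ltf_xml: str):
    """Return (segment_id, token_id, start, end, text) for every TOKEN."""
    root = ET.fromstring(ltf_xml)
    rows = []
    for seg in root.iter("SEG"):
        for tok in seg.iter("TOKEN"):
            rows.append((
                seg.get("id"),
                tok.get("id"),
                int(tok.get("start_char")),   # assumed attribute name
                int(tok.get("end_char")),     # assumed attribute name
                tok.text or "",
            ))
    return rows


# Made-up sample for illustration; not taken from the corpus.
sample = """<LCTL_TEXT><DOC id="EXAMPLE01"><TEXT>
<SEG id="segment-0" start_char="0" end_char="10">
<TOKEN id="token-0-0" start_char="0" end_char="4">Hello</TOKEN>
<TOKEN id="token-0-1" start_char="6" end_char="10">world</TOKEN>
</SEG></TEXT></DOC></LCTL_TEXT>"""

tokens = read_tokens(sample)
```

Because the "id" values are only unique within one file, any index built across files would need to pair them with the file-ID.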
To the extent that sentence boundaries cannot be accurately detected (due to variability or ambiguity in the source data), the segmentation process will tend to err more often on the side of missing actual sentence boundaries, and (we hope) less often on the side of asserting false sentence breaks.

The tokenization is intended to separate punctuation content from word content, and to segregate special categories of "words" that play particular roles in web-based text (e.g. URLs, email addresses and hashtags). To the extent that word boundaries are not explicitly marked in the source text, the LTF tokenization is intended to divide the raw-text character stream into units that correspond to "words" in the linguistic sense (i.e. basic units of lexical meaning).

6.0 AIDA Teams Claim Frame Annotations

The claim frame annotation in this release consists of claim annotations manually produced by: LDC; University of Colorado Boulder; Johns Hopkins University; Language Technologies Institute, Carnegie Mellon University; and University of Illinois Urbana-Champaign.

AIDA teams that manually produced TA3 annotations provided their annotations to LDC, and LDC performed various format corrections to ensure that the file conventions used across teams (including LDC) are consistent. The contents of teams' annotations were not altered in any way; no quality control was done and no corrections were made by LDC to teams' annotations. Only formatting changes were made.

By design, the teams were allowed to develop their own approaches to the claim frame task (annotation approaches, GUIs, training, etc.), following the annotation principles laid out in the Annotation_Principles_for_Claim_Frame_Annotation_AIDA_Phase_3_TA3_V2.1.pdf document found in the docs/ directory of this corpus. The /docs directory contains separate READMEs provided by each team with information regarding the approach taken by that team.
6.1 Team Attributions

The attributions below were provided by each team.

University of Colorado Boulder:
  Claim frame annotation done at the University of Colorado Boulder through efforts coordinated by Michael Regan and Martha Palmer with annotators Adam Pollins and Ahmed ELsayed.

Johns Hopkins University:
  Xia, Patrick; Qin, Guanghui; Vashishtha, Siddharth; Chen, Yunmo; Chen, Tongfei; May, Chandler; Harman, Craig; Rawlins, Kyle; White, Aaron Steve; and Van Durme, Benjamin. 2021. LOME: Large Ontology Multilingual Extraction. EACL-Demos 2021. https://aclanthology.org/2021.eacl-demos.19/

Language Technologies Institute, Carnegie Mellon University:
  Lead annotator: Sue Holm
  Cross-claim relations: Yukari Yamakawa

University of Illinois Urbana-Champaign:
  Revanth Gangi Reddy, Manling Li.

7.0 Annotations

7.1 Exhaustive Mention Annotation (TA1)

The formats of annotations are described in the AIDA_phase_3_TA1_table_field_descriptions_v1.tab file in the docs directory; the sections below provide descriptions of the content of each type of annotation file.

7.1.1 Mentions

A mention is a single reference in source data to a real-world entity or filler, event, or relation. A mention may occur in text, image, or video. A mention of an entity that takes part in an event or relation is called an argument.

There are three mentions tables: one for entities and fillers, one for relations, and one for events. These tables contain information about each annotated mention.

Unlike prior phases, in Phase 3, AIDA annotation ontology types, subtypes, and subsubtypes are not provided in data releases. In lieu of this information, qnodes are provided in new fields corresponding to mention subsubtype. These qnodes were derived from a mapping of LDC-internal annotation tags to the Wikidata qnodes produced by a working group of AIDA researchers.

7.1.2 Slots

A slot is a pre-defined role in an event or relation that is filled by an argument (entity mention).
There are two slots tables, one for relations and one for events. Relation and event mentions in the mentions tables must be looked up in the slots tables to find the arguments and fillers involved in the relation/event. Event mentions can occur as the arguments of other events, in addition to occurring as the arguments of relations.

7.1.3 KB Linking

The TA1 KB linking table in this release provides within-document coreference of events, relations, and entities. Clusters of the same entity, relation, or event in different documents will have different NIL IDs since the coreference annotation is within-document only. For entities with a DWD "identity" qnode, that qnode is provided instead of a NIL ID. Additionally, each entity is assigned: a DWD qnode corresponding to the entity's fine-grained type; a DWD qnode for the entity's top-level base type; and a "generic" flag if/when an entity mention belongs to a generic entity cluster. Although NIL clustering is within-document, mentions assigned a qnode can be considered to have cross-document clustering.

7.2 Claim Frame Annotation (TA3)

Claim Frame annotation consists of identifying claims that are made in the documents and labeling information about each claim. Claim frame annotation includes information about what was claimed, what topic the claim relates to, who made the claim, any affiliations of the claimer, where and when the claim was made, as well as the polarity, certainty, and sentiment of each claim. In addition, Knowledge Elements that support each claim are annotated. Finally, claims are marked as either supporting, refuting, or related to other claims in the cross-claim relations stage of the task.

The formats of annotations are described in the AIDA_phase_3_TA3_field_descriptions_v2_Qnode.tsv file in the docs directory, along with the guidelines for annotation; the sections below provide descriptions of the content of each type of annotation file.
7.2.1 Knowledge Elements

Knowledge elements (KEs) are similar in form and purpose to mentions, except that KEs are typically annotated only once per document. For instance, whereas every mention of "United States" would be exhaustively annotated for TA1 mention annotation, only one "United States" KE would be annotated for a given document for TA3 annotation.

There are three KE tables which contain information about each annotated KE:

  arg_kes.tab -- contains entity and filler information
  rel_kes.tab -- contains relation information
  evt_kes.tab -- contains event information

7.2.2 Slots

As with TA1 annotation, there are two slots tables, one for relations and one for events:

  rel_slots.tab -- relations
  evt_slots.tab -- events

Relation and event KEs in the KE tables must be looked up in the slots tables to find the arguments and fillers involved in the relation/event.

7.2.3 KB Linking

The TA3 KB linking tables in this release provide within-document coreference of events, relations, and entities. Clusters of the same entity, relation, or event KE in different documents will have different NIL IDs since the coreference annotation is within-document only. For entities with an "identity" qnode, that qnode is provided instead of a NIL ID. Additionally, each entity is assigned: a qnode corresponding to the entity's fine-grained type (the most specific type qnode an annotator could find in Wikidata); a qnode for the entity's top-level base type (a broad type, such as "person" or "vehicle", that is the "inherent" entity type without reference to the document context); and a "generic" flag if/when an entity mention refers to a generic rather than specific entity. Although NIL clustering is within-document, mentions assigned a qnode can be considered to have cross-document clustering.

7.2.4 Claim Frames

The claims tables contain each claim frame appearing in an annotated source document.
Each unique claim is annotated once for a document, drawing on information from the document as a whole to complete the claim frame. Each claim frame annotation contains information about the claim itself, the claimer and their stance towards the claim, and other key information about the claim.

In addition to claim frames and associated KEs, claim frame annotation also includes identifying relations between claims across multiple documents (identical, supporting, refuting, related). Cross-claim relations are annotated on a corpus-wide basis, based on all annotated claim frames for the corpus.

Notably, unlike the KEs in the KB linking table, the NIL IDs that have been assigned to information within claim frames are cross-document IDs, meaning that two claim slot fillers with the same NIL ID can be considered coreferent, even when the claims were annotated from different documents.

7.2.5 Queries

aida_phase_3_practice_topic_query_claim_frames.tab contains each claim frame selected for use as a query in the Phase 3 evaluation. Each claim frame was annotated once, drawing on information from the document as a whole to complete the claim frame. Query claim frames are identical in format to regularly annotated claim frames, and have comparable content.

8.0 Software tools included in this release

8.1 ltf2txt

A data file in ltf.xml format (as described above) can be conditioned to recreate exactly the "raw source data" text stream (the rsd.txt file) from which the LTF was created. The tools described here can be used to apply that conditioning, either to a directory or to a zip archive file containing ltf.xml data.

In either case, the scripts validate each output rsd.txt stream by comparing its MD5 checksum against the reference MD5 checksum of the original rsd.txt file from which the LTF was created.
(This reference checksum is stored as an attribute of the "DOC" element in the ltf.xml structure; there is also an attribute that stores the character count of the original rsd.txt file.)

Each script contains user documentation as part of the script content; you can run "perldoc" to view the documentation as a typical unix man page, or you can read it by viewing the script content directly. Also, running either script without any command-line arguments will cause it to display a one-line synopsis of its usage, and then exit.

  ltf2rsd.perl    -- convert ltf.xml files to rsd.txt (raw-source-data)
  ltfzip2rsd.perl -- extract and convert ltf.xml files from zip archives

9.0 Documentation included in this release

The ./docs folder (relative to the root directory of this release) contains a tab-delimited table file, described in a subsection below.

In the following, the term "asset" refers to any single "primary" data file of any given type. Each asset has a distinct 9-character identifier. If two or more files appear with the same 9-character file-ID, this means that they represent different forms or derivations created from the same, single primary data file (e.g. this is how we mark corresponding LTF.xml and PSM.xml file pairs).

Data scouting, annotation and related metadata are all managed with regard to a set of "root" HTML pages (harvested by the LDC for a specified set of topics); therefore the tables and annotations make reference to the asset-IDs assigned to those root pages. However, the present release does not include the original HTML text streams, or any derived form of data corresponding to the full HTML content. As a result, the "root" asset-IDs cited in tables and annotations are not to be found among the inventory of data files presented in zip archives in the "./data" directory.
Each root asset is associated with one or more "child" assets (including images, media files, style sheets, text data presented as ltf.xml, etc.); each child asset gets its own distinct 9-character ID. The root-child relations are provided in the "parent_children.tab" table (described below), and as part of the LDCC header content in the various "wrapped" data file formats (as listed in section 2).

"parent_children.tab" -- relation of child assets to root HTML pages

Each data file-ID in the set of zip archives is represented by the combination of child_uid and child_asset_type (columns 2 and 4).

  Col.#  Content
   1.    parent_uid
   2.    child_uid
   3.    url
   4.    child_asset_type (e.g. ".jpg.ldcc")
   5.    topic ('n/a' for all rows to protect evaluation-sensitive information)
   6.    lang_id (automatically detected language, empty for non-ltf assets; see below)
   7.    lang_manual (manual language identification at the document level, where available)
   8.    rel_pos (position of this asset relative to other child assets on page)
   9.    wrapped_md5 (md5 checksum of .ldcc formatted asset file)
  10.    unwrapped_md5 (md5 checksum of original asset data file)
  11.    download_date (download date of asset)
  12.    content_date (creation date of asset, or n/a)
  13.    status_in_corpus ("present" or "n/a" -- see below)

Notes:

  - Because ltf and psm files have the same "child" uid and differ only in the file extension (.ltf.xml or .psm.xml), only the ltf files are listed in the parent_children.tab document.

  - The URL provided for each .ltf.xml entry in the table is the "full-page" URL for the root document associated with the "parent_uid" value. (For other types of child data -- images and media -- the "url" field contains the specific url for that specific piece of content.)

  - Because the harvesting of some root URLs yielded no text content (hence no ltf/psm data files), the table includes "placeholder" .ltf.xml entries for those parent_uids, in order to provide the full-page URL for every root.
    The "status_in_corpus" field for these entries is set to "n/a" (as opposed to "present").

  - Some child_uids (for images or videos) appear multiple times in the table, because they were found to occur identically in multiple root web pages.

  - When a file contains more than one language (according to either manual or automatic language-id), multiple language codes are presented as a comma-separated list. For some files, automated lang-id may yield a "lang_id" value of "und" meaning "undetermined" (as per ISO-639-3).

AIDA_Phase_3_Exhaustive_Event_Relation_Annotation_V1.2.pdf
  - Annotation guidelines used for exhaustive annotation of event and relation mentions (including their arguments and attributes)

AIDA_Phase_3_Exhaustive_Entity_Filler_Annotation_V1.1.pdf
  - Annotation guidelines for exhaustive annotation of entities and fillers

AIDA_phase_3_TA1_table_field_descriptions_v1.tab
  - Description of the structure of each type of TA1 annotation table. This table includes information about column headers, content of each field, and format of the contents.

Annotation_Principles_for_Claim_Frame_Annotation_AIDA_Phase_3_TA3_V2.1.pdf
  - The version of the claim frame annotation principles document that was current at the time the claim frame annotations in this release were produced.

AIDA_phase_3_TA3_field_descriptions_v2_Qnode.tsv
  - Description of the structure of each type of TA3 annotation table. This table includes information about column headers, content of each field, and format of the contents.

AIDA_phase_3_TA3_cross-claim_relations_field_descriptions.tsv
  - Description of the cross-claim relations annotation table, including information about column headers, content of each field, and format.

AIDA_Phase3_Practice_Topics_Templates.xlsx
  - Lists all Phase 3 practice topics/subtopics and their corresponding templates.
ta1_annotation_doc_lang_topic.tab
  - Provides the root uid, language, and topic for each document with TA1 annotations present in this release

ta3_annotation_doc_lang.tab
  - Provides the root uid and language for each document with TA3 annotations present in this release

ta3_queries_doc_lang.tab
  - Provides the root uid and language for each document with TA3 queries present in this release

CMU_README.txt
  - README provided by Language Technologies Institute, Carnegie Mellon University

Colorado_README.txt
  - README provided by University of Colorado Boulder

JHU_README.txt
  - README provided by Johns Hopkins University

UIUC_README.txt
  - README provided by University of Illinois Urbana-Champaign

10.0 Known Issues

10.1 General

Some parent documents annotated for this release may contain child assets which were deliberately excluded from annotation. This is true even when those unannotated images are children of a parent document that also has other child images that *are* annotated.

10.2 TA1 Annotations

KB linking and coref annotation for TA1 are incomplete. The kb_linking.tab file contains coref and linking information for 49 of the 64 annotated source docs.

A number of lines in the TA1 annotation files contain EMPTY_REF or EMPTY_TBD in place of ontology qnodes or general slot types. Certain LDC annotation tagset types and argument roles do not have a corresponding mapping to a Wikidata qnode or general argument role.

10.3 TA3 Annotations

JHU's TA3 annotations contain no relation annotations for any of the source documents annotated by JHU. As such, no rel_kes.tab or rel_slots.tab file is included in ./data/ta3_jhu.

CMU's TA3 annotations contain a number of entities marked only as "NIL" for their identity qnode. Actual NIL clustering IDs weren't provided in these cases, so LDC has left these as is.
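When loading the annotation tables, it can be useful to screen for the placeholder values noted in 10.2 and 10.3 before relying on qnode fields. The Python sketch below is one hedged way to do that. It assumes each table is tab-delimited with a header row, and it scans every field because the relevant column names are not restated here; the function name and the sample column names are hypothetical.

```python
import csv
from collections import Counter

# Values reported under Known Issues; a bare "NIL" is an exact match only,
# so full NIL cluster IDs are not counted.
PLACEHOLDERS = {"EMPTY_REF", "EMPTY_TBD", "NIL"}


def count_placeholders(lines):
    """Count placeholder values per (column, value) in a tab-delimited table."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)          # assumes the first row holds column names
    counts = Counter()
    for row in reader:
        for name, value in zip(header, row):
            if value in PLACEHOLDERS:
                counts[(name, value)] += 1
    return counts


# Made-up rows for illustration; column names are hypothetical.
demo_rows = [
    "mention_id\tidentity_qnode\ttype_qnode",
    "m1\tQ30\tEMPTY_REF",
    "m2\tNIL\tEMPTY_REF",
    "m3\tNIL00042\tQ5",
]
demo_counts = count_placeholders(demo_rows)
```

For a real file, an open file handle can be passed in place of the list, for example `count_placeholders(open(path, encoding="utf-8", newline=""))`.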
11.0 Acknowledgement

This material is based upon work supported by Air Force Research Laboratory (AFRL) and the Defense Advanced Research Projects Agency (DARPA) under Contract No. FA8750-18-C-0013. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of AFRL or DARPA.

12.0 Copyright

Portions © 2020 20 Minutos Editora, SL, © 2021 24Hours, © 2021 64 parallel online, © 2020 102Neuve.com, © 2020 ABC News, © 2020 Adepa, © 2020 Advance Local Media LLC, © 2020 AFP, © 2021 AirTV Production LLC, © 2021 Al Jazeera Media Network, © 2021 Allen Media Broadcasting, © 2021 American Association for the Advancement of Science, © 2020 Amnesty International, © 2021 AMX Content SA de CV, © 2020 Atresmedia Corporación de Medios de Comunicación, SA, © 2020-2021 Autonomous Nonprofit Organization “TV-Novosti”, © 2020-2021 BBC, © 2021 BGR Media, LLC, © 2020 Bloomberg L.P., © 2021 BMJ Publishing Group Ltd, © 2020 BotaShqip, © 2020 Bulletin of the Atomic Scientists, © 2020 Cable News Network. A Warner Bros.
Discovery Company., ©2021 CARACOL TELEVISIÓN SA, © 2021 CBS Interactive Inc., © 2020 Charter ’97 www.charter97.org, © 2021 China Digital Times, © 2020 CNBC LLC, © 2020 Coba Media LTD, © 2020-2021 Condé Nast, © 2021 Consumer Reports, Inc., © 2021 Daily Herald, © 2021 DIARIO AS, S.L., © 2020 DIARIO EL CORREO, S.A., © 2021 Diario Libre, © 2021 DIARIO NORTE, © 2021 Digital Alert, © 2020 EatingWell.com, © 2021 EDICIONES EL PAÍS, © 2021 Editorial Ecoprensa, S.A., © 2021 EDITORIAL UNIT INFORMACIÓN GENERAL, SLU, © 2020 Elcomercio.pe, © 2021 EL HERALDO SA, © 2021 El Independiente, © 2021 Elsevier Ltd., © 2020 euronews, © 2021 European Journalism Training Association, © 2020 FactCheck.org, © 2021 FAKEOFF, © 2021 Federal State Budgetary Institution "Editing Office of Rossiyskaya Gazeta", © 2020 First republican information and analytical portal “SakhaNews” (“News of Yakutia”), © 2021 FMNervion, SA, © 2020 Forbes Media LLC, © 2021 FOX News Network, LLC, © 2021 France 24, © 2021 Galvis Ramirez & Cia SA, © 2021 GlobalResearch.ca, © 2021 Global Times News Agency Co., Ltd., © 2021 GORDON, © 2021 Grupo La República Publicaciones SA, © 2020 Guardian News and Media Limited or its affiliated companies, © 2021 Healthline Media LLC, © 2021 Hearst Communications, Inc., © 2021 Hearst Magazine Media, Inc., ©2021 Hearst Television Inc. 
on behalf of KOCO-TV, © 2020 HindustanTimes, © 2021 iHeartMedia, Inc., © 2021 Imagen y Comunicación, © 2021 Independent.co.uk, © 2020 Information Agency "Znak", © 2021 Insider Inc., © 2021 InoSMI.ru, © 2020 Institut Pasteur, © 2020-2021 Interfax-Ukraine, © 2021 JSC Business News Media, © 2021 JSC Editorial office of the newspaper "Moskovsky Komsomolets" Electronic periodical "MK.ru", © 2021 JSC Kommersant, © 2020 Kenosha News, © 2021 KFF, © 2021 KQED Inc., © 2021 Kursk.com, © 2021 La Prensa, © 2021 LATINOAMÉRICA21.COM, © 2020 La Vanguardia Ediciones, SLU, © 2021 LA VOZ DE GALICIA SA, © 2020 Lead Stories LLC, © 2020 Lenta.Ru LLC, © 2021 LIVE24 LLC, © 2020-2021 Living Media India Limited, © 2020 LLC "BFM.RU", © 2021 LLC "Kurs", © 2020-2021 LLC "Network of city portals", © 2021 Los Angeles Times, © 2020 Martin’s Wellness, © 2021 Mayo Foundation for Medical Education and Research (MFMER), © 2021 Media Matters for America, © 2021 MediaNews Group, © 2021 Medical Xpress, © 2021 MedicoPlus, © 2021 MIA "DKNews", © 2021 Natural News Network, © 2021 NBC UNIVERSAL, © 2021 Network publication "Vesti.Ru", © 2020 Newsweek Digital LLC, © 2021 Nexstar Media Inc., © 2020 North-West Broadcasting LLC, TV-21 TV Company, Murmansk, © 2021 npr, © 2020 Observer Media Group, © 2021 OK!. 
A DIVISION OF EMPIRE MEDIA GROUP INC., © 2021 Omnia.com.mx, © 2020 Online publication "CentralAsia.news", © 2021 Online publication " Information Agency "RosBalt ", © 2021 People's Daily Online, © 2020 Poynter Institute, © 2021 Publicaciones Semana S.A., © 2021 Public Broadcasting Service (PBS), © 2021 Public Television, © 2021 Publishing House <Komsomolskaya Pravda> JSC, © 2021 Radio Free Asia, © 2021 Rambler, © 2021 RBA Revistas, S.L., © 2021 RealClearHoldings, LLC, © 2020 “REN TV Channel”, © 2020 Reuters, © 2021 RFE/RL, Inc., © 2020 Royal Pharmaceutical Society, © 2021 SA LA NACION, © 2021 SCIENTIFIC AMERICAN, A DIVISION OF SPRINGER NATURE AMERICA, INC., © 2020 SI “GazetaDaily.ru”, © 2021 Sierra Club, © 2021 Sinclair, Inc., © 2021 Snopes Media Group Inc., © 2020 Southern Baptist Convention, © 2021 Spanish Radio and Television Corporation, © 2021 SPH Media Limited, © 2020 Springer Nature Limited, © 2021 Sputnik, © 2021 Stars and Stripes, © 2020 STAT, © 2021 TASS News Agency, © 2021 Television news service, © 2020-2021 The Associated Press, © 2021 The Colorado Sun, © 2020 The Conversation US, © 2021 The Dallas Morning News, © 2020 The Indian Express [P] Ltd., © 2020 The News, © 2021 The New York Times Company, © 2021 The Northside Sun, © 2021 The Philadelphia Inquirer, LLC, © 2021 The Printers (Mysore) Private Limited, © 2021 The San Diego Union-Tribune, © 2020 The Sun, US, Inc, © 2021 The University of Texas MD Anderson Cancer Center, © 2021 The voice of the interior, © 2020 The Washington Post, © 2020 The Washington Times, LLC, © 2021 TIME USA, LLC, © 2020 Tododisca, © 2020 Toronto Star Newspapers Ltd., © 2020 TV Azteca, S.A.B. 
de C.V., © 2020 UKRAINIAN MEDIA HOUSE PUBLISHING LLC, © 2021 Ukrainian Truth, © 2021 UKRI, © 2020-2021 Ukrinform, © 2021 Univision Communications Inc., © 2020 USA TODAY, a division of Gannett Satellite Information Network, LLC, © 2020 Vera Files, © 2021 Vice Media Group, © 2021 vozpopuli.com, © 2021 WHYY, © 2021 WUSA-TV, © 2021 WWB Holdings, LLC, © 2021 XINHUANET.com, © 2021 ZDNET, A Red Ventures company, © 2020, 2021, 2024 Trustees of the University of Pennsylvania