Trees and Crops for the Future - international publications 2013-2024

Ver. 2 - Feb. 2025


1. Introduction


This page presents a bibliometric overview of publications related to Trees and Crops for the Future (TC4F). A manually compiled and curated publication list provided by TC4F has been matched to Scopus/SciVal (see notes a and b) to facilitate analysis of citation impact, internationalization, and collaboration. In addition, subject partitioning derived by cluster analysis is available, enabling topic-level bibliometrics.

Publication-level data is available for download for validation and ad-hoc analysis. A total of 752 publications have been identified for the considered publication period.

1.1 Indicators


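Two indicators are used to highlight citation impact and international collaboration, although many more are available in the downloadable data. Here we focus on the Top 10% indicator (the share of publications among the 10% most highly cited in the database after controlling for subject area, publication type, and year) and the share of international collaborative publications.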


1.2 Implementation Aspects

The source of the above indicators is SciVal (based on Scopus data). When "subject areas" are used in the normalization of the Top 10% indicator described above, the term refers to All Science Journal Classification (ASJC) categories. This classification system operates at the journal level and is used here to provide a relatively coarse-grained subject normalization of the indicator. Furthermore, the observed indicator values are usually supplemented with so-called stability intervals (SI). These intervals are created by repeatedly and randomly selecting 90% of the publications without replacement and calculating the indicator value for each subset. A 90% stability interval is then constructed by setting the lower and upper limits to the 5th and 95th percentiles, respectively, of the resulting distribution.
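The stability interval procedure can be sketched in a few lines of Python. This is a minimal illustration of the subsampling idea described above, not the code used to produce the figures; the function names, the number of rounds (1000), the fixed seed, and the interpolated percentile rule are all assumptions made for the example.

```python
import random


def top10_share(flags):
    """Share of publications flagged as belonging to the top 10% most cited."""
    return sum(flags) / len(flags)


def percentile(sorted_values, p):
    """p-th percentile of an ascending list, using linear interpolation."""
    if len(sorted_values) == 1:
        return sorted_values[0]
    pos = (len(sorted_values) - 1) * p / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (pos - lower)


def stability_interval(flags, indicator=top10_share, fraction=0.9, n_rounds=1000, seed=0):
    """90% stability interval: resample `fraction` of the publications
    without replacement, recompute the indicator each time, and return
    the 5th and 95th percentiles of the resulting values."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    k = max(1, round(len(flags) * fraction))
    values = sorted(indicator(rng.sample(flags, k)) for _ in range(n_rounds))
    return percentile(values, 5), percentile(values, 95)
```

As a usage example with made-up data, `stability_interval([True] * 150 + [False] * 602)` returns a lower and an upper limit that bracket the observed share of 150/752. Because the subsets are drawn without replacement and cover 90% of the set, the interval is narrow; it reflects how sensitive the indicator is to individual publications rather than sampling uncertainty in the usual statistical sense.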

When 'organizational units' constitute a variable, it is important to note that such data in its original form is not standardized. Throughout these exercises, organization disambiguation algorithms are used for identification and standardization. These are provided by a third party (Elsevier), and no manual corrections have been attempted. As is generally the case for bibliometric exercises, the focus should be on the overall structure that emerges. Further information about these disambiguation systems can be found here and here.

A note on the citation window: as the considered publication period is 2013-2024, citation metrics (in contrast to collaboration metrics, for example) for publications published during the last year convey little information. The potential number of received citations is usually very low, and we mostly model noise with an indicator like Top 10%. In addition, 2024 is not a complete year in a bibliographic sense at the start of 2025 (some publications are not yet indexed or might change the final publication year as the publication progresses through publication stages). All in all, interpret information for the latest year with caution or ignore it altogether.

1.3 Subject Partitioning (see note c)


Even though the main level of analysis operates on the aggregated level of the full publication set, a tool for investigating publication volume, citation impact, and collaboration by subject matter is also available. Briefly, we cluster the publications based on their pairwise similarities, where similarity is based on the cited references and word usage in the publications (so-called "hybrid similarity"; see more here). The goal is to highlight the publication population's different research topics or themes. A technical description of the approach is available here and here.
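To make the idea of a hybrid similarity concrete, the sketch below combines a cosine similarity over shared cited references with a cosine similarity over word counts. It is illustrative only: the actual weighting, text processing, and normalization are documented in the linked technical descriptions and are not reproduced here. The field names `references` and `text`, the whitespace tokenization, and the equal default weight are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_similarity(pub_a, pub_b, weight=0.5):
    """Weighted mix of reference-based and text-based similarity.

    `weight` is the share given to the cited-reference component
    (hypothetical parameter; 0.5 is an arbitrary choice).
    """
    ref_sim = cosine(Counter(pub_a["references"]), Counter(pub_b["references"]))
    text_sim = cosine(
        Counter(pub_a["text"].lower().split()),
        Counter(pub_b["text"].lower().split()),
    )
    return weight * ref_sim + (1 - weight) * text_sim
```

Two publications that cite the same sources and use the same vocabulary score close to 1, while publications with no shared references or words score 0. A matrix of such pairwise values is the kind of input a clustering method would then operate on.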

After the publications have been clustered with respect to subject matter, overall descriptions or labels of the clusters are automatically created by analyzing the publications on a cluster-by-cluster basis with a large language model. Also, ASJC codes that are representative of the given cluster (but less so for others) and examples of frequent authors are extracted to help interpret the cluster's subject composition.

Note that an approach like this, which is entirely data-driven and combines bibliometric-based operationalization of subject similarity with machine learning and large language models, presents one view of the publication set. It should be clear that subject partitioning derived like this is naturally and necessarily partly a result of the current data model (how similarity between publications is operationalized), the choice of clustering method, and its associated free parameters (that, e.g., guide the resolution of the partition). While keeping this in mind, it can be a powerful analytic tool to help answer ad-hoc questions regarding the publication set.

The 752 publications have been grouped into 13 clusters.

2. Time Series - publication, citation impact, and collaboration


Figure 1 shows the number of publications per year. Note that some of the variation in publications per year (especially for the earliest year) is probably an artifact of how the original publication list was compiled.

In total, the publications have received more than 41,000 citations (up until February 2025).

Figure 1. Number of publications per year and the cumulative number of citations received. * incomplete bibliographic year.


Figure 2 shows the share of publications among the 10% highest cited in the database after controlling for subject area, publication type, and year. A slight decrease might be observed in the second half of the period. However, the share remains well above the expected value of 10%.

Figure 2. Citation impact - top 10% ("World" baseline). * incomplete bibliographic year.


As shown in Figure 3, international collaboration is high, varying around 60% with a slight upward trend.

Figure 3. Share of international collaborative publications. * incomplete bibliographic year.
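For readers who want to reproduce yearly series like those in Figures 2 and 3 from the downloadable publication-level data, a minimal sketch follows. The field names `year`, `in_top10`, and `international` are hypothetical; the actual column names in the download may differ.

```python
from collections import defaultdict


def yearly_shares(publications):
    """Per-year publication count, Top 10% share, and international share.

    `publications` is an iterable of dicts with the (assumed) keys
    'year', 'in_top10', and 'international'.
    """
    counts = defaultdict(lambda: {"n": 0, "top10": 0, "intl": 0})
    for pub in publications:
        row = counts[pub["year"]]
        row["n"] += 1
        row["top10"] += bool(pub["in_top10"])
        row["intl"] += bool(pub["international"])
    return {
        year: {
            "publications": row["n"],
            "top10_share": row["top10"] / row["n"],
            "intl_share": row["intl"] / row["n"],
        }
        for year, row in sorted(counts.items())
    }
```

A Top 10% share above 0.10 for a given year corresponds to the "above the expected value" reading of Figure 2. As noted in section 1.2, values for the latest, incomplete year should be interpreted with caution.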


3. Subject Partitioning


A subject partition consisting of 13 clusters has been identified. Several different researched themes, such as 'Forest Tree Genomics', 'Sustainable Forest Management' and 'Biopolymer Material Science' emerge. Note that Table 1, complementing the visualization below, is interactive, and it is possible to study the available indicators (citation impact operationalized with the Top 10% indicator or the raw number of citations and degree of international collaboration) at the cluster level.

Figure 4. Switch to full-screen mode by clicking the square in the upper right corner. Click on a node (publication) to get bibliographic information as well as Field-weighted Citation Impact (FWCI) and a link to the bibliographic record in Scopus. The control panel can be accessed by clicking the arrow found on the left edge. Zoom in and out using the buttons in the middle right (or use the scroll wheel on your mouse).


Table 1 complements the visualization in Figure 4.

Table 1. Subject partition - 13 clusters. Label, subfield, summary and keywords are extracted with the help of a large language model without manual intervention.

4. Collaborative Units


The organizational units involved in at least 15 unique publications have been identified (n=20). The number of co-authored publications is used to operationalize the degree of collaboration. SLU and UmU act as two "hubs" in the network. Although much collaboration occurs between these two institutions, each is the largest node within its respective cluster. Further, the collaborative patterns of INNVENTIA, KTH, and Lunds Universitet within the publication set are such that they form a third distinct cluster.
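The counting behind such a collaboration network can be sketched as follows. This is an assumption-laden illustration rather than the procedure actually used: each publication is represented as a list of already disambiguated unit names, and the function name and `min_pubs` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_counts(publications, min_pubs=15):
    """Count publications per unit and co-authored publications per unit pair.

    `publications` is a list of lists of unit names (one list per
    publication). Only units with at least `min_pubs` unique
    publications are kept when counting pairs.
    """
    unit_pubs = Counter()
    for units in publications:
        unit_pubs.update(set(units))  # count each unit once per publication

    kept = {unit for unit, n in unit_pubs.items() if n >= min_pubs}

    pair_counts = Counter()
    for units in publications:
        present = sorted(set(units) & kept)
        pair_counts.update(combinations(present, 2))
    return unit_pubs, pair_counts
```

The pair counts can serve as edge weights in a network whose nodes are the retained units. Grouping the nodes into clusters, as in Table 2, requires a separate community detection step that is not shown here.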

Figure 5. Switch to full-screen mode by clicking the square in the upper right corner. Click on a node (university/institution) to show the number of publications attributed to the unit.


Table 2 complements the visualization in Figure 5.

Table 2. Units with at least 15 publications. Grouped by collaborative pattern (clustering).
Unit | Cluster | Publications (PUB)
Sveriges lantbruksuniversitet (Sweden) | 1 | 627
Skogforsk (Sweden) | 1 | 63
Stockholms universitet (Sweden) | 1 | 23
Commonwealth Scientific and Industrial Research Organisation (Australia) | 1 | 23
Natural Resources Institute Finland (Luke) (Finland) | 1 | 23
Duke University (United States) | 1 | 23
Københavns Universitet (Denmark) | 1 | 22
Helsingin Yliopisto (Finland) | 1 | 21
Göteborgs Universitet (Sweden) | 1 | 20
Asian School of the Environment (Singapore) | 1 | 18
Umeå Universitet (Sweden) | 2 | 193
Umeå Plant Science Centre (Sweden) | 2 | 47
Beijing Forestry University (China) | 2 | 31
The University of British Columbia (Canada) | 2 | 24
Uppsala Universitet (Sweden) | 2 | 21
Universiteit Gent (Belgium) | 2 | 21
Norges Miljø- og Biovitenskapelige Universitet (Norway) | 2 | 15
The Royal Institute of Technology (KTH) (Sweden) | 3 | 60
Lunds Universitet (Sweden) | 3 | 36
INNVENTIA (Sweden) | 3 | 20

a. The following publication types are considered in all analyses: Article, Conference Paper, Review, Chapter, Book, Short Survey, and Data Paper. The following types are not included in the data: Editorial, Erratum, Retracted, Book reviews, and Abstract Report/Conference meeting abstracts.

b. See the Scopus Content Coverage Guide for a detailed description of Scopus content.

c. In the clustering method used ("multiresolution modularity" maximized with the so-called Leiden algorithm), γ controls the resolution (roughly, the number of clusters) of the partitioning. The task is then to identify one or several reasonable values for γ. Here, we have followed an approach that creates a range of cluster solutions by systematically varying the value of γ and identifying regions where the cluster solutions are "stable," i.e., where small or no differences between the cluster solutions exist (Lambiotte, 2010). Stability is taken as a signal that the cluster solution is "robust" and, therefore, likely more interesting to focus on than cluster solutions sensitive to small changes in γ. A hierarchy of cluster solutions, from thematically coarse-grained to very specific, should reasonably exist. In practice, one or several cluster solutions are chosen based on a combination of the stability criterion and a more subjective criterion stipulating that the solution should be "useful" for the purpose for which it was developed (here, essentially, a resolution that is not too low (few clusters) but not so high that the benefit of automatically summarizing the publication volume is lost).
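The search for stable regions of γ described in note c can be illustrated with a small sketch. It assumes the cluster solutions have already been computed for a grid of γ values (running the Leiden algorithm itself is outside the scope of the example) and compares neighbouring solutions with the Rand index. The Rand index and the 0.95 threshold are stand-ins chosen for simplicity, not necessarily the comparison measure used in the analysis or in Lambiotte (2010).

```python
from itertools import combinations


def rand_index(part_a, part_b):
    """Fraction of item pairs on which two partitions agree, i.e. the pair
    is placed together in both or apart in both. Partitions are dicts
    mapping item -> cluster label and must cover the same items
    (at least two)."""
    pairs = list(combinations(sorted(part_a), 2))
    agree = sum(
        (part_a[i] == part_a[j]) == (part_b[i] == part_b[j]) for i, j in pairs
    )
    return agree / len(pairs)


def stable_gammas(partitions_by_gamma, threshold=0.95):
    """Return consecutive (γ1, γ2) pairs whose cluster solutions are
    nearly identical, indicating a stable region of the resolution
    parameter. `threshold` is a hypothetical cut-off."""
    gammas = sorted(partitions_by_gamma)
    return [
        (g1, g2)
        for g1, g2 in zip(gammas, gammas[1:])
        if rand_index(partitions_by_gamma[g1], partitions_by_gamma[g2]) >= threshold
    ]
```

Because the Rand index only looks at whether pairs of publications are grouped together, it is unaffected by how clusters happen to be numbered in each solution. Runs of consecutive stable pairs mark candidate values of γ, from which one is then chosen using the "usefulness" criterion described above.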

Cristian Colliander & Cecilia Sandberg @ UmUB - 2025-02-26.