Virginia Tech Statement on Responsible Use of Research Metrics

This statement was endorsed and approved by the Virginia Tech Faculty Senate on April 21, 2023. It was then officially endorsed and approved at the university level on May 6, 2024 by the University Council and the University President per the Commission on Faculty Affairs Resolution CFA 2023-24E. See the acknowledgements for information about the authors of the statement. Alternatively, you can read the PDF of the statement.

Outlined below is Virginia Tech’s first statement on the responsible use of research metrics, based on the Leiden Manifesto for Research Metrics, but with an emphasis on issues specific to Virginia Tech.[1] The Leiden Manifesto is an international response to the growing reliance on quantitative indicators used to evaluate and govern research, authored by experts in research policy, bibliometrics, and scientometrics. Our statement serves to provide support and guidance to the university community on implementing more inclusive and fair research assessment of faculty and other researchers or scholars, especially as it pertains to communicating explicit expectations during hiring, annual evaluations, and promotion and tenure, and to provide a unified voice from the Virginia Tech faculty on advocating a healthy, encouraging, and diverse research environment.

1. Assessment of individual researchers should be based on a qualitative judgment of their portfolio, and quantitative metrics should support, not supplant, qualitative, expert assessment. 

Research, scholarship, and creative activities at Virginia Tech are diverse and wide ranging, and scholars come from different disciplines and are in different career stages, which all affect metric values. Virginia Tech acknowledges that quantitative metrics can help avoid potentially biased peer assessments during crucial review periods; however, such quantitative metrics should complement, not supplant, expert peer reviews of individuals, and can also be used in macro-level assessments, such as university strategic planning. Quantitative metrics should not be abandoned in individual evaluation, but they should be used to inform or support it rather than relied upon as a sole means to determine an individual’s career advancement, such as for promotion, tenure, or hiring decisions. Selection of appropriate and knowledgeable peer reviewers in the researcher’s field or area of research is crucial to ensuring a fair and balanced assessment process, something that can be challenging for those in interdisciplinary and transdisciplinary research (IDR/TDR).[2] Selection of reviewers should be made with input from the person being assessed, as well as from at least one faculty member familiar with the work being assessed. Departments are strongly encouraged to clearly communicate, preferably in writing, expectations of scholarly productivity, metrics used to assess scholarship, and how those metrics will be used.[3] Such written communication should be part of the formal departmental expectation documents that all units are required to have, but it can also be included in other formal faculty reviews. We recommend flexibility rather than strict measures, especially to accommodate faculty publishing in niche fields and in IDR/TDR.

2. Measure performance against the research missions or values of the institution, group, and/or researcher. 

Goals and missions of individual departments or units should be clearly stated and communicated before quantitative metrics are selected for use in individual assessments; those metrics should be selected carefully and align directly with those goals. Departments and units should also carefully evaluate the metrics for relevance and accuracy in how they can be practically applied to their mission, values, and/or goals. Such missions, goals, and metrics should be department- or unit-driven to reflect discipline-specific needs and values while allowing adaptability for those producing IDR/TDR[4] or other scholarship or creative projects that may not directly align with departmental goals and values, which is crucial for ensuring academic freedom. Communicating expectations with researchers who may not align with discipline- or department-specific goals and expectations is crucial to ensuring success in their academic career.

3. Protect excellence in locally relevant and community-engaged research.

Faculty members and researchers are often assessed based on the journals or venues in which they publish, which are typically rated based on their national or international reputations and/or impact factors. However, research on regional or local issues may be published in local journals, government reports, and at local exhibitions or venues. If departments or individuals value locally relevant and community-engaged research, they should take the initiative to foster growth of faculty members interested in pursuing such research and projects.[5]

4. Keep data collection and analytical processes open, transparent, and simple. 

Transparency in evaluation processes is necessary to ensure researchers have fair and equal opportunities to attain career aspirations when they communicate their work and accomplishments during reviews. Departments should be forthright about the requirements within their expectations documents.[6] Written expectations should strike a balance between the vague and the overly explicit. Furthermore, metrics or numbers should be used within the context of the trajectory of the faculty member’s progress. There are some exceptions in which transparency may not be possible, such as confidential reviews between the faculty member, supervisor, Dean, P&T committee, and so on, but the processes and expectations, including where and when metrics are used, should be communicated to researchers.

5. Allow those evaluated to verify data and analysis.

Publication, citation, awards, and grant data used to monitor faculty activity at the university should always be made openly available, such as the data used to track progress on strategic metrics. Departments must acknowledge that if they do not provide certain data, such as publication data, crucial faculty research activities may be left out of the university’s monitoring and reward systems. In addition, the university should provide education on how to collect and provide such data, and units should plan to budget for funds and staff time towards supporting data collection and research monitoring. 

6. Account for variation by field in publication and citation practices as well as the age of the output(s) being evaluated.

Junior faculty members, whose evaluations occur within a narrow band of time, will generally have fewer publications and citations than senior researchers. Because of this, citation metrics tend to favor senior researchers, as well as those in the STEM-H fields.[7] In addition, university-level metrics tend to disincentivize long-term projects, such as books,[8] which are more likely to be produced by humanities and arts scholars.[9]

In the short term, faculty should be able to demonstrate the impact of such scholarship through alternative and/or qualitative measures. In the long term, the university and departments should actively cultivate a healthy research environment that invites and encourages potentially risky, novel, and/or IDR/TDR projects by not over-emphasizing short-term measures, keeping open minds, and allowing faculty to fully explain their projects.[10]

7. Avoid misplaced concreteness and false precision. 

Virginia Tech is committed to using a wide range of indicators, whether at the micro-level (individual researchers) or the macro-level (e.g., departments, schools, colleges, the university) to ensure that indicators are interpreted with nuance and care. Quantitative indicators designed to measure research impact are prone to conceptual ambiguity, though the data providers that produce these metrics tend to suggest precision and concreteness. However, citation indicator values vary widely based on randomness and conceptual ambiguity; thus, citation indicator values should be used warily and calculated only to the first decimal place, and the sample size should be considered when assigning meaning or weight to an indicator.[11]

8. Recognize the systemic effects of assessment and indicators.

Virginia Tech recognizes that pursuing goals through the use of research metrics can and does invite gaming and manipulation of such metrics, which thus become less meaningful, a concept known as Goodhart’s Law.[12] Thus, it is imperative to have open and explicit discussions about the ways in which metrics are used at all levels to improve transparency and trust between administrators and individual researchers, and to ensure that specific research indicators are not linked to resource distribution.[13]

9. Scrutinize indicators regularly and update them.

Virginia Tech recognizes that goals, missions, and visions shift over the years. In addition, indicators that are trusted and relied upon today may become obsolete or less meaningful or useful. University-level indicators used in the university’s strategic plan and for strategic decision-making should be revisited and regularly updated in accordance with the university’s values, mission, and goals. 


[1] Virginia Tech has achieved significant research milestones since the establishment of the Beyond Boundaries framework in 2015. Both the Beyond Boundaries vision and the University’s Strategic Plan identify milestones and metrics to chart progress and inform decisions. However, the metrics used to assess progress to meet university-level priorities do not always value or incentivize the production of scholarship and creative works outside large commercial databases. Works such as installations at architectural exhibitions and impactful collaborations with communities to tackle pressing challenges are often excluded. Even relatively conventional publications such as journal articles and books can be excluded. We recognize that research metrics are not necessarily used in isolation to inform the university on the overall quality of research and creative activity, nor for strategic decision-making. However, we stress that maintaining balance in the fair assessment of individual researchers and overarching university goals is crucial to a healthy research ecosystem inside the university as well as more broadly across academia. We believe it essential that the university should commit to a fair, balanced, and sensible approach to the use of research metrics both at an individual and institutional level. This statement focuses on an inclusive spirit with respect to all forms and mediums of research, scholarship, and creative works, so long as the work is high quality according to expert peer assessment.

[2] Although Virginia Tech has committed itself to increasing IDR/TDR, the incentives to pursue TDR are not entirely clear or present. For instance, many departments and colleges require the reporting of impact factors or require publication in specific disciplinary journals or venues, which may not be as accepting of IDR/TDR. IDR/TDR journals are still emerging and take time to build their reputations and impact factors. Virginia Tech does promote the importance and value of IDR/TDR, but evaluations of such research need to be more flexible. It also takes time for faculty members who produce IDR/TDR to find qualified external reviewers; nonetheless, such reviewers are critical for a fair and qualitative evaluation process.

[3] For example, a department may require or strongly encourage publication in a list of journals for the purposes of obtaining tenure, but this should be communicated to faculty members early on and maintained consistently throughout their two-, four-, and six-year reviews. A suite of metrics should be allowed for use in evaluations rather than a department requiring or relying heavily or exclusively on an individual metric for performance evaluation (e.g., impact factors, h-index, citation counts, publication counts). In addition, alternative metrics could be used to demonstrate impact outside academia, such as mentions in public policy documents, news media, and patents, but such mentions, as with any metric, should be contextualized.

[4] With a land-grant mission stating that it is “dedicated to improving the quality of life and the human condition within the Commonwealth of Virginia and throughout the world,” additional indicators can be selected at the department or unit level to demonstrate the improvement of life and the human condition within the state and/or throughout the world. For example, Virginia Cooperative Extension publishes numerous fact sheets, eBooks, reports, curriculum materials, and more every year that directly benefit local Virginia communities in areas of farming, gardening, youth development, the environment, and others; besides usage statistics of the reports, there are few if any quantitative metrics to show the “impact” of these publications. Rather, qualitative assessment, such as engaging with members of the community to gauge how the reports have improved their quality of life, is more appropriate. Despite the value of qualitative assessment, it is time-intensive for evaluators and university leaders; thus, alternatives to reviewing a researcher’s entire body of work can be considered, such as asking them to select a few works or accomplishments that are representative of their research as well as the careful selection of external reviewers who are qualified to review their work.

[5] Local research and scholarship can be of high quality even when not broadly transferable, generalizable, or internationally recognized; it can have significant impacts on local communities and therefore should be encouraged, incentivized, and valued. For example, project reports such as Youth Risk Behavior Evaluations culminate in unpublished reports for communities, who are in a good position to use the reports to procure funding for programs. Some departments might view the report(s) as an extension of community service, but many of these reports are quite comprehensive and of exceptional quality, and a case can be made that they are scholarship.

[6] For example, in a closed evaluation system, candidates for promotion and tenure may not know which indicators and expectations are used to evaluate their performance, which allows for speculation, uncertainty, and potentially unfair treatment. In addition, if expectations are general or vague, they can be misconstrued or misinterpreted, whereas explicit expectations can be limiting but provide more clarity.

[7] For example, the author h-index can never decrease, even in the absence of new publications. In addition, the h-index, citation counts, and publication counts are dependent upon the field in which researchers publish as well as the data source. Those in the life sciences tend to have the highest citation counts and publication counts, and thus, they also have the highest h-indices (which are dependent upon both). Fields across the social sciences, arts, and humanities do not tend to produce as many or any journal articles, which are much more common outputs in the STEM-H fields. Instead, these fields may focus more on books, monographs, exhibits, performances, and others, including less traditional forms of scholarship, such as digital humanities projects; these types of scholarship are not well-indexed in commercial bibliographic databases, if included at all. Therefore, some of the more traditional research metrics, such as publication counts, citation counts, impact factors, and the h-index, may not be appropriate indicators for evaluating individuals in these fields. Those who publish single-authored outputs may also be disadvantaged by international collaboration metrics and expectations (also see Principle 3 regarding locally relevant research). Experts in bibliometrics should be consulted when the value or meaning of certain citation indicators is unclear or ambiguous.
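To make the behavior of the h-index concrete, the following is a minimal sketch using the standard definition (the largest h such that h outputs each have at least h citations). The citation lists are hypothetical and chosen only for illustration; they are not drawn from any Virginia Tech data.

```python
def h_index(citations):
    """Return the largest h such that h outputs have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts per output (illustrative only).
early_career = [12, 9, 4, 3, 1]
senior = [40, 33, 21, 18, 15, 12, 9, 7, 6, 5, 3]

# The same early-career outputs a few years later, with no new publications:
# citations only accumulate, so the index can rise or stay flat, never fall.
early_career_later = [count + 2 for count in early_career]

print(h_index(early_career))
print(h_index(senior))
print(h_index(early_career_later))
```

Because the index is capped by the number of outputs, a researcher with five outputs cannot exceed an h-index of five however influential those outputs are, which is one reason it tends to favor longer careers and high-volume fields.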

[8] An historical monograph, for instance, can take five to seven years from start to publication. Citation lifespans also differ across fields, with citations taking approximately two to three years to reach their peak in many STEM-H fields while taking as long as a decade to reach their peak in the arts and humanities, especially for long-term projects.

[9] For example, the Virginia Tech Strategic Plan lists specific milestones to advance regional, national, and global impact, all of which are short-term (two to five years) and naturally incentivize colleges to value and encourage production of certain scholarly works over others, especially those that affect the rankings, which only include publications indexed in commercial databases. There are other research metrics included in the SP as well as on the SP Metrics Dashboard outside of publications and citations, such as awards, invention disclosures, license agreements, and start-up companies created from VT research. The metrics and milestones used to measure scholarly productivity and impact are exclusively from commercial databases and focus on short-term measures, which can create a trickle-down effect on the assessment and hiring of individual researchers. In addition, Virginia Tech is now using publication and citation data from Academic Analytics to allocate a small portion of funds to colleges. University leaders play a key role in how measures are used on the macro-level, how they should or could be used on the meso-level (college and department level), and how they can potentially affect individuals. Even in well-designed systems in which leaders communicate the limitations of metrics on the individual level, such uses, especially when tied to monitoring and rewarding colleges or units through funding allocations, can have unintended consequences. From a study on the effects of a national bibliometric system: “it takes considerable effort from both system designers and institutional leaders to prevent these types of quantitative measures to trickling down and affecting local management practices and ultimately individual behaviour in unintended ways. Explicit and open discussions on the ways in which the indicator is used at all levels are required if uncertainty is to be reduced and if unintended effects are to be minimized” (Aagaard, 2015). 
For university leaders, the allure of readily available metrics to measure performance can tempt them to overemphasize them while simultaneously creating uncertainty and anxiety for individuals who hear contradictions between what administrators say and the reality.

[10] Even commercially available field-normalized citation metrics are flawed, because the mean is used to characterize the distribution of citations, and therefore cannot always be relied upon, especially in small sample sizes, to demonstrate citation impact. To complicate matters further, field-weighted citation metrics, such as Elsevier’s Field-Weighted Citation Impact (FWCI) metric, carry their own problems. Citation counts usually have a skewed distribution, which leads to an inflated average, whereas percentiles are far less sensitive to outliers. Therefore, percentiles should be used instead of averages where possible, and when not possible, outliers should be identified to better interpret the data. Also see Rowlands (2017) on FWCI as it relates to sample sizes. Evaluators should also be aware that novel research and IDR/TDR tend to have higher impact in the long term when successful, but citation growth takes longer than for more traditional, disciplinary research. There is a distinction between novel research and IDR/TDR, though there is overlap. Novel research delves into unexplored topics and areas, and it is typically within the realm of IDR/TDR. IDR/TDR is not always novel research (i.e., it is not always groundbreaking when scholars collaborate across disciplines), though it is still newer and less explored than disciplinary research. Novel research also has a significant impact on advancing IDR/TDR, and it is extraordinarily difficult for novel research to be accepted and published within disciplinary borders. In addition, the combination of certain fields tends to produce more novel or ground-breaking research than others. See Wang et al., 2017 for more insight into the citation bias against novel research. Part of this may be explained by the intellectual challenges of taking on IDR/TDR projects, finding and establishing collaborations across fields, and lengthier submission and review processes with publication venues. See Leahey et al., 2017.
In addition, pursuing novel research and/or IDR/TDR is riskier, which has been shown through greater variance in citation counts, lower funding success, publication in lower-impact journals, delayed recognition (i.e., higher impact in the long term, but citation growth takes longer), and lower productivity (i.e., fewer publications).
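The point about skewed citation distributions can be illustrated with a short sketch. The citation counts below are hypothetical, and `percentile_rank` is a simple helper written for this example rather than the method of any commercial data provider.

```python
from statistics import mean, median


def percentile_rank(value, reference):
    """Percentage of reference values that are less than or equal to value."""
    at_or_below = sum(1 for r in reference if r <= value)
    return 100.0 * at_or_below / len(reference)


# Hypothetical citation counts for ten outputs; one is a highly cited outlier.
citations = [0, 1, 1, 2, 2, 3, 3, 4, 5, 120]
without_outlier = citations[:-1]

# The mean is pulled far above what a typical output receives.
print("mean with outlier:", mean(citations))
print("mean without outlier:", round(mean(without_outlier), 1))

# The median (the 50th percentile) barely moves.
print("median with outlier:", median(citations))
print("median without outlier:", median(without_outlier))

# An output with 3 citations sits at the same rank whether the outlier
# has 120 citations or 12,000.
print("percentile rank of 3 citations:", percentile_rank(3, citations))
```

In this sketch a single output shifts the mean by more than a factor of six while the median changes by half a citation, which is why the outlier should at least be identified when an average has to be reported.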

[11] For example, it is statistically unnecessary to calculate the journal impact factor to three decimal places, yet the data provider (Clarivate Analytics) insists it must do this in order to accurately rank the journals. However, this is not a sound justification, especially for the evaluation of journals. The same can be said for the Field-Weighted Citation Impact (FWCI) from Elsevier; it is also not recommended for sample sizes smaller than ten thousand due to its sensitivity to outliers. Yet, Scopus and SciVal provide the FWCI for individual outputs, researchers, and groups regardless of the sample size.
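The role of sample size can be sketched as follows. The example assumes, purely for illustration, that a group-level indicator is the arithmetic mean of per-output ratios of actual to expected citations, where 1.0 means an output is cited exactly as expected. It does not reproduce Elsevier’s actual FWCI calculation, and the function names and values are hypothetical.

```python
def mean_ratio(ratios):
    """Arithmetic mean of per-output citation ratios (1.0 = cited as expected)."""
    return sum(ratios) / len(ratios)


def shift_from_one_outlier(sample_size, outlier=50.0):
    """How far one highly cited output moves the mean of an otherwise average set."""
    baseline = [1.0] * sample_size
    with_outlier = baseline + [outlier]
    return mean_ratio(with_outlier) - mean_ratio(baseline)


# Report the shift to one decimal place, in line with Principle 7.
for n in (10, 100, 10_000):
    print(f"sample size {n}: mean shifts by {round(shift_from_one_outlier(n), 1)}")
```

Under these assumptions, one outlier moves the indicator by several whole units for a set of ten outputs but by less than a tenth of a unit for a set of ten thousand, so digits beyond the first decimal place carry little meaning for an individual researcher or a small group.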

[12] Goodhart’s Law can be summarized as: “When a measure becomes a target, it ceases to be a good measure.” See Fire & Guestrin, 2019 or the corresponding blog post (Fire, 2019) for details on how Goodhart’s Law can be seen in academic publishing. In the UK, some institutions actively sought to hire researchers with established publication records directly leading up to the country’s national research institution assessment, the Research Excellence Framework (completed every seven years), rendering the REF’s publication and citation measures less meaningful. Such effects should be anticipated; to mitigate this, a suite of measures should be used, which can complement one another and lend themselves to more nuanced interpretation. Further, directly incentivizing or rewarding scholarship with funding can be harmful to individual researchers and their career paths (see Aagaard, 2015a; Aagaard, 2015b; and de Rijcke, 2015), depending on their career stage and their research area or field.

[13] Please see The SCOPE Model Framework, page 12, “The evaluation impact matrix” for more context around how the purposes behind evaluation can affect certain entities more than others. In addition, please see how the European Union and the UK are gradually moving away from metrics-based assessment: “Grants and hiring: will impact factors and h-indices be scrapped?” and the EUA Agreement on Reforming Research Assessment.



The Responsible Research Assessment Task Force acknowledges the previous work on responsible research assessment at the university: 

Members of the task force who researched other responsible research assessment statements and drafted this statement:

  • Rachel Miles (UL, Research Impact & Intelligence), Chair
  • Jim Hawdon (CLAHS, Sociology)
  • Jim Kuypers (CLAHS, Communications)
  • Kim Niewolny (CALS, Agricultural, Leadership, and Community Education)
  • Ico Bukvic (CLAHS, School of Performing Arts)
  • Bikrum Singh Gill (CLAHS, Political Science)
  • Anita Kablinger (VTCSOM, Psychiatry)
  • Rachel Lin Weaver (CAUS, School of Visual Arts)
  • Carla Finkielstein (COS, Biological Sciences)
  • Leigh-Anne Krometis (COE, BSE)
  • Joe Merola (COS, Chemistry)
  • Todd Schenk (CAUS, School of Public and International Affairs)
  • James (Jim) Fraser (CNRE, Fish and Wildlife Conservation)
  • Kerry Redican (VMCVM, Population Health Sciences)
  • Mike Horning (CLAHS, Communications)
  • Viswanath “Venki” Venkatesh (Pamplin, Business Information Technology)



Aagaard, K. (2015a). How incentives trickle down: Local use of a national bibliometric indicator system. Science and Public Policy, 42(5), 725–737.

Aagaard, K. (2015b). How incentives trickle down: Local use of a national bibliometric indicator system. Science and Public Policy, 42(5), 725–737.

Aagaard, K., Bloch, C., & Schneider, J. W. (2015). Impacts of performance-based research funding systems: The case of the Norwegian Publication Indicator. Research Evaluation, 24(2), 106–117.

Beyond Boundaries Steering Committee. (2016). ENVISIONING VIRGINIA TECH BEYOND BOUNDARIES: A 2047 Vision: A framework prepared by Beyond Boundaries participants. Virginia Tech.

European University Association. (2022). European University Association (EUA) Agreement on Reforming Research Assessment. European University Association.

Faculty Perceptions of Research Assessment at Virginia Tech—Journal of Altmetrics. (n.d.). Retrieved February 21, 2023, from

Fire, M. (2019, June 2). Goodhart’s Law: Are Academic Metrics Being Gamed? The Gradient.

Fire, M., & Guestrin, C. (2019). Over-optimization of academic publishing metrics: Observing Goodhart’s Law in action. GigaScience, 8(6), giz053.

Hawdon, J., Heaton, M., House, L., Miles, R., Pannabecker, V., Pleasant, R., & Wokutch, R. (2020). Recommendations for Assessing Faculty: A Faculty Senate Subcommittee Report. Virginia Tech.

Hicks, D., Wouters, P., Waltman, L., de Rijcke, S., & Rafols, I. (2015). Bibliometrics: The Leiden Manifesto for research metrics. Nature, 520(7548), 429–431.

INORMS Research Evaluation Group. (2021). The SCOPE Framework: A five-stage process for evaluating research responsibly. INORMS.

Jump, P. (2013, September 26). Twenty per cent contracts rise in run-up to REF. Times Higher Education (THE).

Kuypers, J., Westwood, J., Wong, E., Houptman, J., Meany, K., Knapp, B., Viehland, D. D., Hicok, B., Leonard, R. H., Woods, C., Mielczarek, N., Dayer, A., Thomas, Q., Merola, J., Redican, K. J., McMillan, G., Miles, R., Pannabecker, V., Porter, N. D., & Macdonald, A. B. (2019). Virginia Tech Faculty Senate Research Assessment Committee’s June 2019 Report on Faculty Research and Salary (Board of Visitors Meeting Minutes, pp. 12–107). Virginia Tech.

Leahey, E., Beckman, C. M., & Stanko, T. L. (2016). Prominent but Less Productive: The Impact of Interdisciplinarity on Scientists’ Research. Administrative Science Quarterly, 62(1).

Miles, R. A., MacDonald, A. B., Porter, N. D., Pannabecker, V., & Kuypers, J. A. (2019a). What Do Faculty Think About Researcher Profiles, Metrics, and Fair Research Assessment? A Case Study from a Research University in the Southeastern United States.

Miles, R. A., MacDonald, A., Porter, N. D., Pannabecker, V., & Kuypers, J. A. (2019b, September 5). What Do Faculty Think About Researcher Profiles, Metrics, and Fair Research Assessment? A Case Study from a Research University in the Southeastern United States [Panel Presentation]. 10th Annual VIVO Conference, Podgorica, Montenegro.

Miles, R., Pannabecker, V., & Kuypers, J. A. (2020). Faculty Perceptions of Research Assessment at Virginia Tech. Journal of Altmetrics, 8(1).

Miles, R., Pannabecker, V., MacDonald, A., Kuypers, J., & Porter, N. D. (2019, October 9). Faculty Perceptions on Research Impact Metrics, Researcher Profile Systems, Fairness of Research Evaluation, and Time Allocations [Panel Presentation]. 6:AM Altmetrics Conference, Stirling, UK.

Rijcke, S. de, Wouters, P. F., Rushforth, A. D., Franssen, T. P., & Hammarfelt, B. (2016). Evaluation practices and effects of indicator use—A literature review. Research Evaluation, 25(2), 161–169.

Rowlands, I. (2017, May 11). SciVal’s Field weighted citation impact: Sample size matters! – The Bibliomagician. The Bibliomagician.

Sands, T. (2018, January 16). Beyond Boundaries Implementation. The Office of the President.

Strategic Planning Metrics Subcommittee. (2018). Metrics White Paper: On the Design and Use of Metrics. Virginia Tech.

Virginia Tech. (2022). The Virginia Tech Difference: Advancing Beyond Boundaries. Office for Strategic Affairs.

Virginia Tech Office for Strategic Affairs. (2020). THE VIRGINIA TECH DIFFERENCE ADVANCING BEYOND BOUNDARIES Strategic Plan. Virginia Tech.

Wang, J., Veugelers, R., & Stephan, P. (2017). Bias against novelty in science: A cautionary tale for users of bibliometric indicators. Research Policy, 46(8), 1416–1436.

Woolston, C. (2022). Grants and hiring: Will impact factors and h-indices be scrapped? Nature.