
What Increasing Compute Efficiency Means for the Proliferation of Dangerous Capabilities


This blog post summarises the recent working paper “Increased Compute Efficiency and the Diffusion of AI Capabilities” by Konstantin Pilz, Lennart Heim, and Nicholas Brown.

GovAI research blog posts represent the views of their authors rather than the views of the organisation.


Introduction 

The compute needed to train an AI model to a given level of performance gets cheaper over time. In 2017, training an image classifier to 93% accuracy on ImageNet cost over $1,000. In 2021, it cost only $5 — a reduction of over 99%. We describe this decline in cost — driven by both hardware and software improvements — as an improvement in compute efficiency.

One implication of these falling costs is that AI capabilities tend to diffuse over time, even if leaders in AI choose not to share their models. Once a large compute investor develops a new AI capability, there will usually be only a short critical period before many lower-resource groups can reproduce the same capability.

However, this does not imply that large compute investors’ leads will erode. Compute efficiency improvements also allow them to develop new capabilities more quickly than they otherwise would. Therefore, they may push the frontier forward more quickly than low-resource groups can catch up.

Governments will need to account for these implications of falling costs. First, since falling costs will tend to drive diffusion, governments will need to prepare for a world where dangerous AI capabilities are widely available — for instance, by developing defenses against harmful AI models. In some cases, it may also be rational for governments to try to “buy time,” including by limiting irresponsible actors’ access to compute.

Second, since leading companies will still tend to develop new capabilities first, governments will still need to apply particularly strong oversight to leading companies. It will be particularly important that these companies share information about their AI models, evaluate their models for emerging risks, adopt good information security practices, and — in general — make responsible development and release decisions.

The causes of falling costs 

Falling training costs stem from improvements in two key areas:

  1. Advances in hardware price performance, as predicted by Moore’s Law, increase the number of computational operations that a dollar can buy. Between 2006 and 2021, the price performance of AI hardware doubled approximately every two years.
  2. Advances in algorithmic efficiency decrease the number of computational operations needed to train an AI model to a given level of performance. For example, between 2012 and 2022, advances in image recognition algorithms halved the compute required to achieve 93% classification accuracy on the ImageNet dataset every nine months.

To capture the combined impact of these factors, we introduce the concept of compute investment efficiency — abbreviated to compute efficiency — which describes how efficiently investments in training compute can be converted into AI capabilities. Compute efficiency determines the AI model performance available with a given level of training compute investment, provided the actor also has sufficient training data (see Figure 1).

Figure 1: Compute (investment) efficiency is the relationship between training compute investment and AI model performance.
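
As a rough, back-of-the-envelope illustration of how these two trends compound, the sketch below combines a two-year hardware price-performance doubling time with a nine-month algorithmic halving time (the figures cited above) to estimate how much cheaper it becomes to reach a fixed performance level after a given number of years. The function and the four-year horizon are illustrative choices, not part of the working paper.

    HARDWARE_DOUBLING_YEARS = 2.0     # price performance doubles roughly every two years
    ALGORITHMIC_HALVING_YEARS = 0.75  # required compute halves roughly every nine months

    def cost_reduction_factor(years: float) -> float:
        """How many times cheaper it becomes to reach a fixed performance level."""
        hardware_gain = 2 ** (years / HARDWARE_DOUBLING_YEARS)       # more operations per dollar
        algorithmic_gain = 2 ** (years / ALGORITHMIC_HALVING_YEARS)  # fewer operations needed
        return hardware_gain * algorithmic_gain

    # Over four years: roughly 4x from hardware and 40x from algorithms, i.e. ~160x cheaper.
    print(round(cost_reduction_factor(4.0)))

Under these assumed rates, the cost of reaching a fixed capability falls by more than two orders of magnitude over four years, broadly in line with the ImageNet example in the introduction.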

Access and performance effects

Based on our model, we observe that increasing compute efficiency has two main effects:

  • An access effect: Over time, access to a given level of performance requires less compute investment (see Figure 2, red).
  • A performance effect: Over time, a given level of compute investment enables increased performance (see Figure 2, blue).
Figure 2: Compute efficiency improves between time t = 0 and t = 1, causing an access effect (red) and a performance effect (blue).
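
To make these two effects concrete, the toy model below assumes that performance grows with the logarithm of effective compute (investment multiplied by compute efficiency). The functional form and the specific numbers are illustrative assumptions rather than anything taken from the paper.

    import math

    def performance(investment_usd: float, compute_efficiency: float) -> float:
        """Toy performance curve: log of effective compute (investment x efficiency)."""
        return math.log10(investment_usd * compute_efficiency)

    eff_t0, eff_t1 = 1.0, 10.0  # compute efficiency improves 10x between t = 0 and t = 1

    # Performance effect: the same $10M investment yields higher performance at t = 1.
    print(performance(10e6, eff_t0), "->", performance(10e6, eff_t1))

    # Access effect: the performance that required $10M at t = 0 needs only $1M at t = 1.
    print(performance(1e6, eff_t1) == performance(10e6, eff_t0))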

If actors experience the same compute efficiency improvements, then these effects have the following consequences:

Capabilities diffuse over time. Due to the access effect, the investment required to reach a given performance level decreases over time, giving an increased number of actors the ability to reproduce capabilities previously restricted to large compute investors.

Large compute investors remain at the frontier. Since large compute investors achieve the highest performance levels, they are still the first to discover new model capabilities that allow novel use cases. Absent a ceiling on absolute performance, those actors will also continue to demonstrate the highest level of performance in existing capabilities.

The emergence and proliferation of dangerous capabilities

Future AI models could eventually show new dangerous capabilities, such as exploiting cybersecurity vulnerabilities, aiding bioweapon development, or evading human control. We now explore the discovery and proliferation of dangerous capabilities as compute efficiency increases. 

Figure 3: Illustration of the emergence and proliferation of dangerous capabilities across three actors. The large compute investor first achieves dangerous capability X at time t = 1. When the secondary actor (such as a small company) reaches dangerous capability X at t = 2, the large compute investor has already achieved the even more dangerous capability Y.

Dangerous capabilities first appear in models trained by large compute investors. Since dangerous capabilities require high levels of performance, large compute investors likely encounter them first.

These dangerous capabilities then proliferate over time, even if large compute investors limit access to their models. As compute efficiency improves, more actors can train models with dangerous capabilities, so the capabilities can spread even when large compute investors provide only limited or structured access to their models. This proliferation increases the chance of misuse and accidents.

Defensive tools based on leading models could potentially increase resilience against these dangerous capabilities. To counteract harm caused by weaker models, large compute investors may be able to use their more advanced models to create defensive tools. For example, cybersecurity tools powered by advanced models could find vulnerabilities before weaker models can exploit them. However, some domains, such as biotechnology, may greatly favor the offense, making it difficult to defend against dangerous deployments even with superior models.

Governance implications

Oversight of large compute investors can help to address the most severe risks, at least for a time. If the most severe risks from AI development originate from the most capable models and their large-scale deployment, then regulating large-scale compute users can — at least for a time — address these risks. For instance, governments can require developers of large-scale models to perform dangerous capability evaluations and risk assessments, report concerning results, and use the results to make responsible release decisions. Governments can also encourage or require developers to implement good information security practices to prevent their models from leaking or being stolen. Furthermore, governments can develop the capability to quickly detect and intervene when models created by these developers cause harm.

Large compute investors should warn governments and help them prepare for the proliferation of advanced capabilities. The effectiveness of societal measures to mitigate harm from proliferation hinges on the time that passes between large compute investors discovering harmful capabilities and their proliferation to malicious or irresponsible actors. To effectively use this critical period, governments can implement information-sharing frameworks with large compute investors and thoroughly evaluate the risks posed by capability proliferation. Additionally, leaders can invest in and provide defensive solutions before offensive capabilities proliferate.

Governments should respond early to offense-dominant capabilities. In the future, AI models of a given performance could develop heavily offense-dominant capabilities (i.e., capabilities that are inherently difficult to defend against) or become inherently uncontrollable. Governments should closely monitor the emergence of such capabilities and preemptively develop mechanisms — including mechanisms for more tightly governing access to compute — that could substantially delay their proliferation if necessary.

Summary

Compute efficiency describes how efficiently investments in training compute can be converted into AI capabilities. It has been rising quickly over time due to improvements in both hardware price performance and algorithmic efficiency.

Rising compute efficiency will tend to cause new AI capabilities to diffuse widely after a relatively short period of time. However, since large compute investors also benefit from rising compute efficiency, they may be able to maintain their performance leads by pushing forward the frontier. 

One governance implication is that large compute investors will remain an especially important target of oversight and regulation. At the same time, it will be necessary to prepare for — and likely, in some cases, work to delay — the widespread proliferation of dangerous capabilities. 

Appendix

Competition between developers: complicating the picture

Our analysis — based on a simple model — has shown that increases in compute efficiency do not necessarily alter the leads of large compute investors. However, some additional considerations complicate the picture.

We will start by noting some considerations that suggest large compute investors may actually achieve even greater leads in the future. We will then move to considerations that point in the opposite direction.

Figure 4: Compute investment scaling increases the performance lead of large compute investors over time. The dashed arrows represent performance improvements attainable without investment scaling. 

Leaders can further their performance advantages through scaling investments and proprietary algorithmic advancements. Large compute investors have historically scaled their compute investment significantly faster than others, widening the investment gap to smaller actors. Additionally, the proprietary development of algorithmic and hardware enhancements might further widen this divide, consolidating leaders’ competitive advantage.

In zero-sum competition, small relative performance advantages may grant outsized benefits. If AI models directly compete, the developer of the leading model may reap disproportionate benefits even if their absolute performance advantage is small. Such disproportionate rewards occur in games such as chess but likely also apply to AI models used in trading, law, or entertainment.

Winner-takes-all effects may allow leaders to entrench their lead despite losing their performance advantage. By initially developing the best-performing models, large compute investors may accrue a number of advantages unrelated to performance, such as network effects and economies of scale that allow them to maintain a competitive advantage even if they approach a performance ceiling.

Performance ceilings dampen the performance effect, reducing leaders’ advantage. Many AI applications have a ceiling on technical performance or real-world usefulness. For instance, handwritten digit classifiers have achieved above 99% accuracy since the early 2000s, so further progress offers little additional value. As leaders approach the ceiling, performance only marginally increases with improved compute efficiency, allowing smaller actors to catch up.

Leaders can release their model parameters, allowing others to overcome compute investment barriers. Large compute investors can provide smaller actors access to their advanced models. While product integrations and structured access protocols allow for limited and fine-grained proliferation, releasing model parameters causes irreversible capability proliferation to a broad range of actors.

Ultimately — although increases in compute efficiency do not erode competitive advantages in any straightforward way — it is far from clear exactly how we should expect competition between developers to evolve.





Research Fellows | GovAI Blog


GovAI was founded to help humanity navigate the transition to a world with advanced AI. Our first research agenda, published in 2018, helped define and shape the nascent field of AI governance. Our team and affiliate community possess expertise in a wide variety of domains, including AI regulation, responsible development practices, compute governance, AI-lab corporate governance, US-China relations, and AI progress forecasting.

GovAI researchers have advised decision makers in government, industry, and civil society. Most recently, our researchers have played a substantial role in informing the UK government’s approach to AI regulation. Our researchers have also published in top peer-reviewed journals and conferences, including International Organization, NeurIPS, and Science. Our alumni have gone on to roles in government; top AI labs, including DeepMind, OpenAI, and Anthropic; top think tanks, including the Centre for Security and Emerging Technology and RAND; and top universities, including the University of Oxford and the University of Cambridge.

GovAI also runs a range of programmes – including our Summer/Winter Fellowship Programme and our Research Scholar Programme – to support the career development of promising AI governance researchers. We are committed to both producing impactful research and strengthening the broader AI governance community.

Research Fellows will conduct research into open and important questions that bear on AI governance. We are interested in candidates from a range of disciplines who have a demonstrated ability to produce excellent research and who care deeply about the lasting impacts of AI, in line with our mission. The role would offer significant research freedom, access to a broad network of experts, and opportunities for collaboration.

Research Fellows are expected to set their own impact-focused research agenda, with guidance from other members of the team; they are also expected to offer supervision and mentorship to junior researchers, such as our Summer and Winter Fellows, and to participate in seminars. However, Research Fellows will dedicate the substantial majority of their time to research projects of their own choosing. They will be encouraged to collaborate and co-author with other members of our community but may also focus on solo projects if they choose. 

We are committed to supporting the work of Research Fellows by offering research freedom, expert guidance, funding for projects, productivity tools, limited obligations on one’s time, access to a broad network of experts and potential collaborators, and opportunities to communicate one’s research to policymakers and other audiences.

For promising researchers who lack sufficient experience, we may consider instead offering one-year visiting “Research Scholar” positions.

We are open to candidates with a wide range of research interests and intellectual backgrounds. We have previously hired or hosted researchers with backgrounds in computer science, public policy, political science, economics, history, philosophy, and law. 

You might be a particularly good fit if you have:

  • Demonstrated ability to produce excellent research
  • Deep interest in the lasting implications of artificial intelligence, in line with our organisation’s mission
  • Established expertise in a domain with significant AI governance relevance
  • Self-directedness and desire for impact
  • Commitment to intellectual honesty and rigour
  • Good judgement regarding the promise and importance of different research directions
  • Excellent communication and collaboration skills
  • Proactivity and commitment to professional growth
  • Strong interest in mentorship
  • Broad familiarity with the field of AI governance

There are no specific educational requirements for the role, although we expect that the most promising candidates will typically possess several years of relevant research or policy experience.

Contracts are full-time and have a fixed two-year term, with the possibility of renewal.

We prefer for Research Fellows to work primarily from our office in Oxford, UK. However, we also consider applications from strong candidates who are only able to work remotely. We are able to sponsor visas in the UK and the US.

Research Fellows will be compensated in line with our salary principles. Depending on their experience, we expect that successful candidates’ annual compensation will typically fall between £60,000 and £80,000 if based in Oxford, UK. In cases where a Research Fellow resides predominantly in a city with a higher cost of living, this salary will be adjusted to account for the difference. In exceptional cases, there may be some flexibility in compensation. 

Benefits associated with the role include a £5,000 annual wellbeing budget; a £1,500 annual commuting budget; a budget for any necessary purchases of books or work equipment; private health, dental, and vision insurance; a 10% employer pension contribution; and 25 days of paid vacation in addition to public holidays.

Please inquire through co*****@go********.ai if questions or concerns regarding compensation or benefits might affect your decision to apply.

The first stage of the process involves filling in an application form. (The front page of the form lists the required material, which includes a 2–5-page explanation of what you might work on as a Research Fellow.) The second round involves completing a paid remote work test. Candidates who pass through the second round should expect to participate in a set of interviews and may also be asked to produce additional written material. Please feel free to reach out to co*****@go********.ai if you would need a decision communicated by a particular date or if you have questions about the application process.

We are committed to fostering a culture of inclusion, and we encourage individuals with underrepresented perspectives and backgrounds to apply. We especially encourage applications from women, gender minorities, people of colour, and people from regions other than North America and Western Europe who are excited about contributing to our mission. We are an equal opportunity employer.

We would also like to highlight that we are inviting applications to Research Scholar positions (general track or policy track) right now. These are one-year visiting positions intended to support the career development of researchers who hope to positively influence the lasting impact of artificial intelligence.





Summer Fellowship 2024 | GovAI Blog


GovAI’s mission is to help humanity navigate the transition to a world with advanced AI. Our world-class research has helped shape the nascent field of AI governance. Our team and affiliate community possess expertise in a wide variety of domains, including US-China relations, arms race dynamics, EU policy, and AI progress forecasting.

We are looking for early-career individuals or individuals new to the field of AI governance to join our team for three months and learn about the field of AI governance while making connections with other researchers and practitioners. This opportunity will be a particularly good fit for individuals who are excited to use their careers to shape the lasting implications of AI.

Summer and Winter Fellows join GovAI to conduct independent research on a topic of their choice, with mentorship from leading experts in the field of AI governance. Fellows will also join a series of Q&A sessions with AI governance experts, research seminars, and researcher work-in-progress meetings. Each Fellow will be paired with a primary mentor from the GovAI team and be introduced to others with relevant interests and expertise, typically from our affiliate and alumni network.

You can read about the topics our previous cohort of Winter Fellows worked on here.

Past Fellows have gone on to work on AI governance full-time in government or at organisations including GovAI, OpenAI, the AI Now Institute, and RAND. Others have gone on to build relevant expertise at leading universities such as MIT, Stanford University, University College London, and the University of Oxford.

As a Fellow, you will spend the first week or two of the fellowship exploring research topic options before settling on a research proposal with input from your mentors and Ben Garfinkel, GovAI’s Director.

Emma Bluemke, GovAI’s Research Manager, will support you in deciding what project and output will be most valuable for you to work towards, for example, publishing a report, journal article, or blog post. You will also take time to explore the wider AI governance space and discuss follow-on career opportunities in the field of AI governance with our team.

We strongly encourage you to apply if you have an interest in our work and are considering using your career to study or shape the long-term implications of advanced AI.

Given the multidisciplinary nature of our work, we are interested in candidates from a broad set of disciplines including political science, public policy, history, economics, sociology, law, philosophy, and computer science. We are particularly interested in hosting more researchers with strong technical backgrounds. There are no specific educational requirements for the role, although we expect that the most promising candidates will typically have relevant graduate study or research experience in related areas.

When assessing applications, we will be looking for candidates who have the following strengths or show positive signs of being able to develop them:

Quality of work: The ability to produce clearly written, insightful, and even-handed research. We are particularly excited about strong reasoning ability and clear and concise writing.

Relevant expertise: Skills or knowledge that are likely to be helpful for work on AI governance. We think that relevant expertise can take many different forms. Note that we also do not have any strict degree requirements.

Judgement: The ability to prioritise between different research directions, and good intuitions about the feasibility of different research directions.

Team Fit: Openness to feedback, commitment to intellectual honesty and rigour, comfort in expressing uncertainty, and a serious interest in using your career to contribute to AI governance.

Summer and Winter Fellowships last for three months, and Fellows will receive a stipend of £9,000, plus support for travelling to Oxford. While in Oxford, we provide our Fellows with lunch on weekdays and a desk in our office. This is intended to be a full-time and in-person role, based in Oxford, UK. We are able to sponsor visas. For successful applicants who require a visa, note that you will need to remain in your country of visa application for some time while the visa application is underway. 

Summer Fellows will join for three months, from June to August (precise dates TBC). In exceptional cases, fellows may join us off-season. Please feel free to reach out if you would not be able to join during a standard visiting period.

Applications for the 2024 Summer Fellowship are now closed. The application process consists of a written submission in the first round, a remote work test in the second round, and an interview in the final round. The first page of the application form contains a description of the materials required for the first round. We expect to reach out to Summer Fellowship candidates for paid work tests in January, offer interviews in early February, and communicate final decisions to candidates in late February. Please feel free to reach out if you would need a decision communicated earlier than the standard timeline (this may or may not be possible), or have questions about the application process.

We accept applications from anywhere in the world. We are committed to fostering a culture of inclusion, and we encourage individuals with diverse backgrounds and experiences to apply. We especially encourage applications from women, gender minorities, and people of colour who are excited about contributing to our mission. We are an equal opportunity employer. If you are concerned that you’re not the right fit but have a strong interest in the Fellowship, we encourage you to apply anyway.





Research Scholar (General) | GovAI Blog


Note: There is a single, shared application form and application process for all Research Scholar position listings.

GovAI was founded to help humanity navigate the transition to a world with advanced AI. Our first research agenda, published in 2018, helped define and shape the nascent field of AI governance. Our team and affiliate community possess expertise in a wide variety of domains, including AI regulation, responsible development practices, compute governance, AI company corporate governance, US-China relations, and AI progress forecasting.

GovAI researchers — particularly those working within our Policy Team — have closely advised decision makers in government, industry, and civil society. Our researchers have also published in top peer-reviewed journals and conferences, including International Organization, NeurIPS, and Science. Our alumni have gone on to roles in government, in both the US and UK; top AI companies, including DeepMind, OpenAI, and Anthropic; top think tanks, including the Centre for Security and Emerging Technology and RAND; and top universities, including the University of Oxford and the University of Cambridge.

Although we are based in Oxford, United Kingdom — and currently have an especially large UK policy focus — we also have team members in the United States and European Union.

Research Scholar is a one-year visiting position. It is designed to support the career development of AI governance researchers and practitioners — as well as to offer them an opportunity to do high-impact work.

As a Research Scholar, you will have freedom to pursue a wide range of styles of work. This could include conducting policy research, social science research, or technical research; engaging with and advising policymakers; or launching and managing applied projects. 

For example, past and present Scholars have used the role to:

  • produce an influential report on the benefits and risks of open-source AI;
  • conduct technical research into questions that bear on compute governance;
  • take part in the UK policy-making process as a part-time secondee in the UK government; and
  • launch a new organisation to facilitate international AI governance dialogues.

Over the course of the year, you will also deepen your understanding of the field, connect with a network of experts, and build your skills and professional profile, all while working within an institutional home that offers both flexibility and support.

You will receive research supervision from a member of the GovAI team or network. The frequency of supervisor meetings and feedback will vary depending on supervisor availability, although once-a-week or once-every-two-weeks supervision meetings are typical. There will also be a number of additional opportunities for Research Scholars to receive feedback, including internal work-in-progress seminars. You will receive further support from Emma Bluemke, GovAI’s Research Manager.

Some Research Scholars may also — depending on the focus of their work — take part in GovAI’s Policy Team, which is led by Markus Anderljung. Members of the GovAI Policy Team do an especially large amount of policy engagement and coordinate their work more substantially. They also have additional team meetings and retreats. While Policy Team members retain significant freedom to choose projects, there is also an expectation that a meaningful portion of their work will fit into the team’s joint priorities.

We are open to work on a broad range of topics. To get a sense of our focus areas, you may find it useful to read our About page or look at examples listed on our Research page. Broad topics of interest include — but are not limited to — responsible AI development and release practices, AI regulation, international governance, compute governance, and risk assessment and forecasting.

We are open to candidates with a wide range of backgrounds. We have previously hired or hosted researchers with academic backgrounds in computer science, political science, public policy, economics, history, philosophy, and law. We are also interested in candidates with professional backgrounds in government, industry, and civil society.

For all candidates, we will look for:

  • A strong interest in using their career to positively influence the lasting impact of artificial intelligence, in line with our organisation’s mission
  • Demonstrated ability to produce excellent work (typically research outputs) or achieve impressive results
  • Self-direction and proactivity
  • The ability to evaluate and prioritise projects on the basis of impact
  • A commitment to intellectual honesty and rigour
  • Receptiveness to feedback and commitment to self-improvement
  • Strong communication skills
  • Collaborativeness and motivation to help others succeed
  • Some familiarity with the field of AI governance
  • Some expertise in a domain that is relevant to AI governance 
  • A compelling explanation of how the Research Scholar position may help them to have a large impact

For candidates who are hoping to do particular kinds of work (e.g. technical research) or work on particular topics (e.g. US policy), we will also look for expertise and experience that is relevant to the particular kind of work they intend to do.

There are no educational requirements for the role. We have previously made offers to candidates at a wide variety of career stages. However, we expect that the most promising candidates will typically have either graduate study or relevant professional experience.

Duration

Contracts will be for a fixed 12-month term. Although renewal is not an option for these roles, Research Scholars may apply for longer-term positions at GovAI — for instance, Research Fellow positions — once their contracts end.

Location

Although GovAI is based in Oxford, we are a hybrid organisation. Historically, a slight majority of our Research Scholars have actually chosen to be based in countries other than the UK. However, in some cases, we do have significant location preferences:

  • If a candidate plans to focus heavily on work related to a particular government’s policies, then we generally prefer that the candidate is primarily based in or near the most relevant city. For example, if someone plans to focus heavily on US federal policy, we will tend to prefer that they are based in or near Washington, DC.

  • If a candidate would likely be involved in managing projects or launching new initiatives to a significant degree, then we will generally prefer that they are primarily based out of our Oxford office.

  • Some potential Oxford-based supervisors (e.g. Ben Garfinkel) also have a significant preference for their supervisees being primarily based in Oxford.

If you have location restrictions – and concerns about your ability to work remotely might prevent you from applying – please inquire at re*********@go********.ai. Note that we are able to sponsor both UK visas and US visas.

Salary

Depending on their experience, we expect that successful candidates’ annual compensation will typically fall between £60,000 (~$75,000) and £75,000 (~$95,000) if based in Oxford, UK. If a Research Scholar resides predominantly in a city with a higher cost of living, their salary will be adjusted to account for the difference. As reference points, a Research Scholar with five years of relevant postgraduate experience would receive about £66,000 (~$83,000) if based in Oxford and about $94,000 if based in Washington DC. In rare cases where salary considerations would prevent a candidate from accepting an offer, there may also be some flexibility in compensation.

Benefits associated with the role include health, dental, and vision insurance, a £5,000 (~$6,000) annual wellbeing budget, an annual commuting budget, flexible work hours, extended parental leave, ergonomic equipment, a competitive pension contribution, and 25 days of paid vacation in addition to public holidays.

Please inquire with re*********@go********.ai if questions or concerns regarding compensation or benefits might affect your decision to apply.

Applications for this position are now closed. The application process consists of a written submission in the first round, a paid remote work test in the second round, and a final interview round. The interview round usually consists of one interview but might involve an additional interview in some cases. We also conduct reference checks for all candidates we interview.

Please feel free to reach out to re*********@go********.ai if you would need a decision communicated by a particular date, if you need assistance with the application due to a disability, or if you have questions about the application process. If you have any questions specifically related to the GovAI Policy Team, feel free to reach out to ma***************@go********.ai.

We are committed to fostering a culture of inclusion, and we encourage individuals with underrepresented perspectives and backgrounds to apply. We especially encourage applications from women, gender minorities, people of colour, and people from regions other than North America and Western Europe who are excited about contributing to our mission. We are an equal opportunity employer.





Research Manager | GovAI Blog


GovAI was founded to help humanity navigate the transition to a world with advanced AI. Our first research agenda, published in 2018, helped define and shape the nascent field of AI governance. Our team and affiliate community possess expertise in a wide variety of domains, including AI regulation, responsible development practices, compute governance, AI company corporate governance, US-China relations, and AI progress forecasting.

GovAI researchers have closely advised decision makers in government, industry, and civil society. Our researchers have also published in top peer-reviewed journals and conferences, including International Organization, NeurIPS, and Science. Our alumni have gone on to roles in government, in both the US and UK; top AI companies, including DeepMind, OpenAI, and Anthropic; top think tanks, including the Centre for Security and Emerging Technology and RAND; and top universities, including the University of Oxford and the University of Cambridge.

As Research Manager, you will be responsible for managing and continually improving the systems that underlie our research pipeline. 

Responsibilities will include: 

  • Building, overseeing, and refining systems for project selection, feedback, publication, and dissemination.
  • Providing operational support to researchers, for instance facilitating the selection of research assistants and managing copy-editing.
  • Improving the intellectual environment at GovAI by coming up with helpful events with internal and external guests, as well as designing other measures that facilitate intellectual engagement (e.g. the structure of our physical and virtual spaces).
  • Serving as an additional source of individual support and accountability for some researchers.
  • Helping researchers communicate their work to relevant audiences, by identifying appropriate channels, unlocking those channels, and helping researchers shape their work to fit those channels. This also includes being responsible for our quarterly newsletter and other organisational communication mostly focused on research.
  • Being the point person for requests for collaboration, speaking opportunities, and other researcher interactions with outside stakeholders. Potentially proactively identifying such opportunities and pitching them to researchers.
  • For candidates with sufficiently strong writing skills, writing or helping researchers to write summaries of their work for the GovAI blog or other venues.

We’re selecting candidates who are:

  • Excited by the opportunity to use their careers to positively influence the lasting impact of artificial intelligence, in line with our organisation’s mission.
  • Organised and competent at project management. This role will require the ability to manage concurrent work streams, and we need someone who can demonstrate highly structured work habits, confidence in prioritising between tasks, and a conscientious approach to organisation.
  • Driven by a desire to produce excellent work and achieve valuable results. Successful candidates will actively seek out feedback and opportunities to improve their skills.
  • Highly autonomous and proactive. Successful candidates will proactively identify pain points and inefficiencies in GovAI’s research process and set out to fix them.
  • Able to support our researchers in overcoming challenges in their work and to hold them accountable for their projects. Experience with research or research management is a strong plus.
  • Ideally, knowledgeable about the field of AI governance and GovAI’s work. While not a fixed requirement, a solid understanding of current topics in the field – like responsible scaling policies, capabilities evaluations, and compute governance – will be a strong plus.
  • Excellent at oral and written communication. This role will require clear and prompt communication with a wide range of stakeholders, both over email and in person.

This position is full-time. Our offices are located in Oxford, UK, and we strongly prefer team members to be based here, although we are open to hiring individuals who require short periods of remote work. We are able to sponsor visas. 

The Research Manager will be compensated in line with our salary principles. As such, the salary for this role will depend on the successful applicant’s experience, but we expect the range to be between £60,000 (~$75,000) and £75,000 (~$94,000). In rare cases where salary considerations would prevent a candidate from accepting an offer, there may also be some flexibility in compensation. 

Benefits associated with the role include health, dental, and vision insurance, a £5,000 annual wellbeing budget, an annual commuting budget, flexible work hours, extended parental leave, ergonomic equipment, a 10% pension contribution, and 33 days of paid vacation (including Bank Holidays).

The application process consists of a written submission in the first round, a paid remote work test in the second round, and an interview in the final round. We also conduct reference checks for all candidates we interview. Please apply using the form linked below.

GovAI is committed to fostering a culture of inclusion and we encourage individuals with underrepresented perspectives and backgrounds to apply. We especially encourage applications from women, gender minorities, people of colour, and people from regions other than North America and Western Europe who are excited about contributing to our mission. We are an equal opportunity employer and want to make it as easy as possible for everyone who joins our team to thrive in our workplace. 

If you would need a decision communicated by a particular date, need assistance with the application due to a disability, or have any other questions about applying, please email re*********@go********.ai.





Summer Fellowship 2023 Wrap Up – What Did Our Fellows Work On?


The Summer and Winter Fellowships offer an opportunity for up-and-coming individuals to invest three months in AI governance research projects, deepen their knowledge of the field, and forge connections with fellow researchers and practitioners.

Our Summer Fellows come from a variety of disciplines and a range of prior experience – some fellows ventured into entirely new intellectual territory for their projects, and some fellows used the time to extend their previous work.

We extend our sincere appreciation to all our supervisors for their dedicated mentorship and guidance this summer, as well as their commitment to nurturing the next generation of researchers.

If you’re interested in applying for future fellowships, check out our Opportunities page. You can register your expression of interest here.





Winter Fellowship 2023 Wrap Up – What Did Our Fellows Work On?


Our 2023 Winter Fellowship recently ended, and we’re proud to highlight what our Winter Fellows have been up to.

Summer and Winter Fellowships provide an opportunity for early-career individuals to spend three months working on an AI governance research project, learning about the field, and making connections with other researchers and practitioners. 

Winter Fellows come from a variety of disciplines and a range of prior experience – some fellows ventured into entirely new intellectual territory for their projects, and some fellows used the time to extend their previous work. 

We gratefully thank all of the supervisors for their mentorship and guidance this winter, and for dedicating time to training the next generation of researchers. 

If you’re interested in applying for future fellowships, check out our Opportunities page. You can register your expression of interest here.





Winter Fellowship 2024 Wrap Up – What Did Our Fellows Work On?


The Summer and Winter Fellowships offer an opportunity for up-and-coming individuals to invest three months in AI governance research projects, deepen their knowledge of the field, and forge connections with fellow researchers and practitioners.

Our Winter Fellows come from a variety of disciplines and a range of prior experience – some fellows ventured into entirely new intellectual territory for their projects, and some fellows used the time to extend their previous work.

We extend our sincere appreciation to all our supervisors for their dedicated mentorship and guidance this winter, as well as their commitment to nurturing the next generation of researchers.

If you’re interested in applying for future fellowships, check out our Opportunities page. You can register your expression of interest here.





Evaluating Predictions of Model Behaviour


GovAI research blog posts represent the views of their authors, rather than the views of the organisation.

Introduction

Some existing AI systems have the potential to cause harm, for example through the misuse of their capabilities, through reliability issues, or through systemic bias. As AI systems become more capable, the scale of potential harm could increase. In order to make responsible decisions about whether and how to deploy new AI systems, it is important to be able to predict how they may behave when they are put into use in the real world.

One approach to predicting how models will behave in the real world is to run model evaluations. Model evaluations are tests for specific model capabilities (such as the ability to offer useful instructions on building weapons) and model tendencies (such as a tendency to exhibit gender bias when rating job applications). Although model evaluations can identify some harmful behaviours, it can be unclear how much information they provide about a model’s real-world behaviour. The real world is often different from what can be captured in a model evaluation. In particular, once a model is deployed, it will be exposed to a much wider range of circumstances (e.g. user requests) than it can be exposed to in the lab.

To address this problem, I suggest implementing prediction evaluations to assess an actor’s ability to predict how model evaluation results will translate to a broader range of situations. In a prediction evaluation, an initial set of model evaluations is run on a model. An actor — such as the model evaluation team within an AI company — then attempts to predict the results of a separate set of model evaluations, based on the initial results. Prediction evaluations could fit into AI governance by helping to calibrate trust in model evaluations. For example, a developer could use prediction evaluations internally to gauge whether further investigation of a model’s safety properties is warranted.

More work is required to understand whether, how, and when to implement prediction evaluations. Actors that currently engage in model evaluations could experiment with prediction evaluations to make progress on this work. 

Prediction evaluations can assess how well we understand model generalisation

Deciding when it is safe to deploy a new AI system is a crucial challenge. Model evaluations – tests conducted on models to assess them for potentially harmful capabilities or propensities – can inform these decisions. However, models will inevitably face a much wider range of conditions in the real world than they face during evaluations. For example, users often find new prompts (which evaluators never tested) that cause language models such as GPT-4 and Claude to behave in unexpected or unintended ways.

We therefore need to understand how model evaluation results generalise: that is, how much information model evaluations provide about how a model will behave once deployed. Without an understanding of generalisation, model evaluation results may lead decision-makers to mistakenly deploy models that cause much more real-world harm than anticipated.

We propose implementing prediction evaluations to assess an actor’s understanding of how model evaluation results will generalise. In a prediction evaluation, an initial set of model evaluations is run on a model and provided to an actor. The actor then predicts how the model will behave on a distinct set of evaluations (test evaluations), given certain limitations on what the actor knows (e.g. about details of the test evaluations) and can do while formulating their prediction (e.g. whether they can run the model). Finally, a judge grades the actor’s prediction based on the results of running the test set evaluations. The more highly an actor scores, the more likely they are to have a strong understanding of how their model evaluation results will generalise to the real world.

Figure 1 depicts the relationship between predictions, prediction evaluations, model evaluations, and understanding of generalisation.

Figure 1: Prediction evaluations indirectly assess the level of understanding that an actor has about how its model evaluations generalise to the real world. The basic theory is: If an actor cannot predict how its model will perform when exposed to an additional set of “test evaluations”, then the actor also probably cannot predict how its model will behave in the real world.

Prediction evaluations could support AI governance in a number of ways. A developer could use the results of internally run prediction evaluations to calibrate their trust in their own model evaluations. If a model displays unexpectedly high capability levels in some contexts, for example, the developer may want to investigate further and ensure that their safety mitigations are sufficient. 

A regulator could also use the results of (potentially externally run) prediction evaluations to inform an array of safety interventions. For example, consider the context of a hypothetical licensing regime for models, in which developers must receive regulatory approval before releasing certain high-risk models. If a model developer performs poorly on prediction evaluations, their claims about the safety of a model may be less credible. A regulator could take into account this information when deciding whether to permit deployment of the model. If the developer’s predictions are poor, then the regulator could require it to evaluate its model more thoroughly.

How to run a prediction evaluation

In the appendix to this post, we provide more detail about how to run a prediction evaluation. Here, we provide a brief overview. First, the administrator of the prediction evaluation should select the model evaluations. Second, the administrator should prevent the actor from running the test evaluations when making the prediction. Finally, the administrator needs to establish standards for good prediction performance.

An example of running a prediction evaluation

Our example here focuses on a regulator in the context of a hypothetical licensing regime, in which developers of certain high-risk models require regulatory approval before these models can be deployed. Other potential examples to explore in future work could include a developer running prediction evaluations internally, a regulator running prediction evaluations on itself to assess its own understanding, or some actor running prediction evaluations on a model user (e.g. a company that uses models at a large scale).

Suppose that a developer submits a model and its evaluations to a regulator for approval. The regulator could administer a prediction evaluation to the developer through a process similar to the following:

  1. Based on the initial model evaluations that the developer submitted, the regulator builds a set of test evaluations. The test evaluations could include a wider variety of inputs than the initial model evaluations, but still feature the same category of task.
  2. The regulator puts the developer in a controlled, monitored environment, such that the developer cannot run the test evaluations on the model. 
  3. The regulator provides the developer with a detailed description of the test set evaluations. 
  4. For each test evaluation, the regulator asks the developer to predict whether the model will succeed at the task (the developer provides a “yes” or “no” answer).
  5. The developer provides a prediction to the regulator.
  6. The regulator compares the prediction with the actual behaviour of the model on the test evaluations.

Consider a case in which the developer does not perform much better than chance on the prediction evaluation (i.e. performs close to 50% accuracy for yes/no questions). Such performance would be evidence of a poor understanding of how the model’s behaviour generalises. As a result, greater caution from the regulator may be justified. The regulator’s response to the poor performance could vary in severity depending on the potential harm the model could cause. Some options include:

  • Requesting more extensive model evaluations before deployment
  • Subjecting deployment of the model to additional requirements, such as more stringent monitoring
  • Blocking deployment or further training until specified conditions are met, such as good performance on subsequent prediction evaluations

Further research is required to understand whether and when any of these options would be appropriate, and what other options exist.
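
As a minimal sketch of the comparison in step 6, the code below scores a set of hypothetical yes/no predictions against the observed test-evaluation outcomes and checks how far accuracy exceeds the 50% chance baseline. The example data and the margin used are illustrative assumptions, not a proposed regulatory standard.

    # Hypothetical predictions and observed outcomes for ten test evaluations.
    predictions = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "no"]
    observed    = ["yes", "no", "no",  "yes", "no", "yes", "yes", "no", "no",  "no"]

    correct = sum(p == o for p, o in zip(predictions, observed))
    accuracy = correct / len(predictions)
    CHANCE_BASELINE = 0.5  # expected accuracy from guessing on yes/no questions

    print(f"accuracy = {accuracy:.0%}, chance baseline = {CHANCE_BASELINE:.0%}")
    if accuracy - CHANCE_BASELINE < 0.1:  # illustrative margin, not a real threshold
        print("Little evidence that the developer understands how its evaluations generalise.")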

Limitations and open questions

There is still a great deal of uncertainty about whether it is worthwhile to run prediction evaluations. For example, suppose that a developer has run an initial set of model evaluations but still is not confident about how well these model evaluations will generalise to the real world. A comparatively straightforward strategy to become more confident would be to simply run a wider range of model evaluations, without bothering to make any explicit predictions. If these additional model evaluations also suggest that the model is safe, then — even if some of the specific results have been surprising — perhaps the developer would still be justified in believing that its models will ultimately also behave safely in the real world.

Furthermore, prediction accuracy may not vary enough — between the actors who are making the predictions or between the models that the predictions concern — for it to be worthwhile to assess prediction accuracy in individual cases. For example, it may be the case that people generally cannot reliably predict the results of model evaluations very well at all. Although this general result would be useful to know, it would also reduce the value of continuing to perform prediction evaluations in individual cases.

There are also various practical questions that will need to be answered before prediction evaluations can be run and used to inform decisions. These open questions include:

  1. How feasible is it to predict behaviour on model evaluations without running the model — and how does feasibility change with information or action limits on the actor?
  2. How should we limit what the actor knows and can do in a prediction evaluation?
  3. How should the initial and test evaluations be chosen?
  4. How should the results of a prediction evaluation be reported? For example, should the actor provide different predictions corresponding to different amounts of compute used?

If prediction evaluations should ultimately be built into a broader AI governance regime, then a number of additional questions arise. 

  1. Who should administer prediction evaluations?
  2. Which actors should undergo prediction evaluations?
  3. How can prediction evaluations incentivise improvements in understanding?
  4. What is the role of prediction evaluations in an overall evaluation process?

Fortunately, there are immediate opportunities to make progress on these questions. For instance, to tackle questions 1-4, those developing and running evaluations on their models can at the same time run prediction evaluations internally. For such low-stakes experiments, one may easily be able to vary the amount of time, information, or compute given for the prediction evaluation and experiment with different reporting procedures.

Conclusion

To make informed development and deployment decisions, decision-makers need to be able to predict how AI systems will behave in the real world. Model evaluations can help to inform these predictions by showing how AI systems behave in particular circumstances. 

Unfortunately, it is often unclear how the results of model evaluations generalise to the real world. For example, a model may behave well in the circumstances tested by a particular model evaluation, but then behave poorly in other circumstances it encounters in the real world.

Prediction evaluations may help to address this problem, by testing how well an actor can predict how model evaluations will generalise to some additional circumstances. Scoring well on a prediction evaluation is evidence that the actor is capable of using the model evaluations to make informed decisions.

However, further work is needed to understand whether, how, and when to use prediction evaluations.

The author of this piece would like to thank the following people for helpful comments on this work: Ross Gruetzemacher, Toby Shevlane, Gabe Mukobi, Yawen Duan, David Krueger, Anton Korinek, Malcolm Murray, Jan Brauner, Lennart Heim, Emma Bluemke, Jide Alaga, Noemi Dreksler, Patrick Levermore, and Lujain Ibrahim. Thanks especially to Ben Garfinkel, Stephen Clare, and Markus Anderljung for extensive discussions and feedback.

Alan Chan can be contacted at al*******@go********.ai

Appendix

Running a prediction evaluation 

This section describes each step in a prediction evaluation in more detail.

Selecting the model evaluations

The first step is choosing the initial and test set evaluations.

Since the science of model evaluations is still developing, it is not obvious which specific evaluations should be used for prediction evaluations. One hypothesis is that they should target specific use cases, such as ways to misuse models for cyberattacks. Such specific targeting may be desirable because understanding of generalisation in one use case may not transfer to understanding in another use case. That makes it more important to understand model generalisation in high-stakes use cases. On the other hand, it may be easier to work in simpler, but not necessarily realistic, environments. Such environments may provide clearer insights into generalisation, but the insights may not be immediately relevant to any deployment setting.

When deciding how the test evaluations should differ from the initial evaluations, one should try to account for the range of conditions the model might face in the real world. For example, test evaluations may test a more diverse range of inputs to the model. When evaluating whether the model can complete complex tasks, it may also be important to vary how the environment responds to the model’s actions. One could vary the tools (e.g. access to web search) available to models in the initial and test evaluations to simulate how users may augment models with different tools following deployment. Initial and test evaluations could even assess the completion of different tasks. For instance, we may be interested in a model’s ability to assist in the creation of chemical weapons. Test evaluations could focus on a different set of chemical weapons than the initial evaluations.
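
As a simple illustration of the kind of variation described above, the sketch below represents each evaluation as a small record and changes the prompt style and available tools between the initial and test sets while keeping the task category fixed. The fields and example tasks are hypothetical, not drawn from any real evaluation suite.

    from dataclasses import dataclass

    @dataclass
    class Evaluation:
        task: str          # category of capability being tested
        prompt_style: str  # how the input is phrased
        tools: tuple       # tools available to the model during the evaluation

    initial_evals = [
        Evaluation("identify a software vulnerability", "direct request", ()),
        Evaluation("identify a software vulnerability", "direct request", ("code interpreter",)),
    ]

    # Test evaluations keep the same task category but vary inputs and tool access.
    test_evals = [
        Evaluation("identify a software vulnerability", "obfuscated request", ("web search",)),
        Evaluation("identify a software vulnerability", "multi-step dialogue", ("web search", "code interpreter")),
    ]

    print(len(initial_evals), len(test_evals))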

Preventing the actor from running the test evaluations

For a prediction evaluation to provide good evidence of understanding of generalisation, we need to prevent the actor from simply running the test evaluations and reporting the results. The incentive to cheat in this way may be strongest when a regulator subjects a model developer to a prediction evaluation. Prediction evaluations that a developer runs internally may not suffer from this problem as significantly. 

Ideally, we want the actor to predict accurately only because they have gained some insight into generalisation. To achieve this goal, we can limit the actor’s knowledge and possible actions. Future research could investigate the effectiveness of the following possibilities:

  • Limiting the actor’s knowledge
    • Only providing the actor a high-level description of the test set evaluations
    • Describing to the actor some but not all of the test set evaluations
    • Limiting the number of initial evaluations provided to the actor
  • Limiting the actor’s possible actions
    • Limiting the extent to which the actor can interact with the model
    • Limiting the amount of time an actor has to provide predictions
    • Limiting the amount of computing power (compute) the actor can use to make the prediction. 

Crucially, the limitations should not be so strict that it becomes impossible to make accurate predictions even for well-understood models. 

Assessing prediction evaluations

The final step is to assess the accuracy of the predictions and make decisions based on the results. A key issue is defining what counts as good prediction performance. It may prove difficult to make accurate predictions even after extensive efforts to understand a model’s behaviour. In addition, different standards may be appropriate for different applications of prediction evaluations (e.g., a model developer internally running prediction evaluations vs. a regulator administering prediction evaluations to a model developer).

A potential standard could require the actor to have higher prediction accuracy than some reference class. For example, consider an independent control group whose members have no detailed knowledge of the model beyond basic information such as training compute and model size. An actor that predicts worse than this control group likely does not have a strong understanding of how model evaluation results generalise.

In the context of a decision about model deployment, the direction in which a prediction is inaccurate may be a key consideration. Underestimating a model’s capabilities (equivalently, overestimating its degree of safety) may be more costly than overestimating its capabilities (underestimating its degree of safety), because greater societal harm could result from deploying the model.

A regulator could more heavily penalise underestimation, but in so doing may create strong incentives to overestimate a model’s capabilities. Ideally, prediction evaluation should incentivise efforts to gain understanding. One potential solution could be to assess the work that actors produce to justify their predictions, in addition to the predictions themselves. Estimates based on faulty or vague reasoning could be judged to be inferior to the same estimates with good reasoning. Alternatively, the regulator could try to identify and penalise consistent overestimation across a number of different prediction evaluations.
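
One way to operationalise this asymmetry is sketched below: a per-question penalty that is larger when the actor predicted the model would fail at a task it actually succeeded at (underestimation of capability) than in the reverse case. The particular weights are arbitrary illustrations, not values recommended by this post.

    PENALTY_UNDERESTIMATE = 3.0  # predicted failure, but the model actually succeeded
    PENALTY_OVERESTIMATE = 1.0   # predicted success, but the model actually failed

    def penalty(predicted_success: bool, actual_success: bool) -> float:
        """Asymmetric penalty for a single yes/no prediction."""
        if predicted_success == actual_success:
            return 0.0
        return PENALTY_UNDERESTIMATE if actual_success else PENALTY_OVERESTIMATE

    # (predicted, actual) pairs: one underestimate, one overestimate, one correct prediction.
    results = [(False, True), (True, False), (True, True)]
    print(sum(penalty(p, a) for p, a in results))  # 3.0 + 1.0 + 0.0 = 4.0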



