Vegas Pro (formerly known as Sony Vegas) is a professional video editing software package for non-linear editing (NLE), designed to run on the Microsoft Windows operating system. The first release of Vegas Beta was on June 11, 1999. Vegas was originally developed as a non-linear audio editing application. Version 2.0 would split the program into audio and video editing variants, with the former being dropped by version 4.0, making the video offering the only variant available to consumers. Vegas Pro features real-time multi-track video and audio editing on unlimited tracks, resolution-independent video sequencing, complex effects, compositing tools, 24-bit/192 kHz audio support, VST and DirectX plug-in effect support, and Dolby Digital surround sound mixing. The software was originally published by Sonic Foundry until May 2003, when Sony purchased Sonic Foundry and formed Sony Creative Software. On May 24, 2016, Sony announced that Vegas was sold to MAGIX, which formed VEGAS Creative Software, to continue support and development of the software. As of the end of March 2026, it was publicly announced that Boris FX had taken ownership of Vegas Pro. Each release of Vegas is sold standalone; however, upgrade discounts are sometimes provided. == Features == Vegas does not require any specialized hardware to run properly, allowing it to operate on any Windows computer that meets the system requirements. == History == Vegas 1.0 was released after a brief public beta by Sonic Foundry on July 23, 1999 at the NAMM Show in Nashville, Tennessee as an audio-only tool with a particular focus on re-scaling and resampling audio. It supported formats like DivX and Real Networks RealSystem G2 file formats. Martin Walker from Sound on Sound described working in Vegas 1.0 as a "very pleasurable experience, especially since so many functions are highly intuitive" though also criticizing some features as hard to figure out due to the lack of a central help file. Later, on June 12, 2000, Vegas Video and Audio 2.0 (also referred to as just Vegas 2.0) was released, with its beta releasing earlier that year on April 10. This was the first version of Vegas to include video-editing tools and was also the first to have a low-cost "LE" version alongside the regular release. The LE releases would continue through version 3.0 of Vegas but would be discontinued by the release of Vegas 4.0. Vegas 3.0 was released the next year on December 3, and added new video effects, features for ease-of-use with DV, and support for editing Windows Media files. Vegas 4.0 was released on 6 February 2003 and added application scripting, advanced color correction, 5.1 surround sound mixing, and Steinberg ASIO support. This was the last release under the Sonic Foundry name after it sold much of its software suite, including Sound Forge and Acid Pro, to Sony Pictures Digital for $18 million later in 2003. Under Sony's ownership, Vegas 5.0 was released on April 19, 2004, bringing 3D track motion, compositing, reversing, envelope automation, etc. 7.0 also added an improved video preview, enhanced layout management, improved snapping, and more customization. With the release of 8.0, Sony opted to go back to the original "Vegas Pro" branding that the first version released with. It added the ability to burn Blu-ray and DVD optical media, support for 32-bit floating point audio, support for tempo-based audio effects, and more. It also moved the timeline to the bottom of the window by default with the option of moving it back to the top if the user wished to. Sony was also experimenting with 64-bit at this time and ported Vegas Pro 8.0 to 64-bit systems under the name "Vegas Pro 8.1". Vegas Pro 9.0 added support for 4K resolution and pro camcorder formats like Red and XDCAM EX. In 2009, Sony Creative Software purchased the Velvetmatter Radiance suite of video FX plug-ins which were included in Sony Vegas Pro 9.0. As a result, they were no longer available as a separate product from Velvetmatter. Vegas Pro 10 was released in 2010 with stereoscopic 3D editing, image stabilization, OpenFX plugin support, real-time audio event effects, and a few UI changes. This was the last release to include support for Windows XP. Vegas Pro 11 was released the next year on 17 October, with GPGPU video acceleration, enhanced text tools, enhanced stereoscopic/3D features, RAW photo support, and new event synchronization mechanisms. In addition, Vegas Pro 11 comes pre-loaded with "NewBlue" Titler Pro, a 2D and 3D titling plug-in. Vegas Pro 12 would add two new configurations: Vegas Pro 12 Edit, for "Professional Video and Audio Production"; and Vegas Pro 12 Suite, for "Professional Editing, Disc Authoring, and Visual Effects Design". Vegas Pro 13 would be the last version released with Sony branding after the acquisition of much of Sony Creative Software's library by Magix. After they acquired Vegas, Magix released version 14 on September 20, 2016. It featured advanced 4K upscaling as well as many bug fixes, a higher video velocity limit, RED camera support, and a variety of other features. This was also the last version to have the light theme enabled by default. Released on August 28, 2017, Vegas Pro 15 features major UI changes that claim to bring usability improvements and customization. It was the first version of VEGAS Pro to have a dark theme; it also allows more efficient editing speeds, including adding new shortcuts to speed the video editing process. Vegas Pro 15 includes support for Intel Quick Sync Video (QSV) and other technologies, as well as various other features. It introduced a new VEGAS Pro icon as a V. Vegas Pro 16 has some new features including file backup, motion tracking, improved video stabilization, 360° editing and HDR support. Magix has continued to improve Vegas through version 21 with support for reading Matroska files, a more detailed render dialogue, live streaming, VST3 support, a VST 32-bit bridge, and a selective Paste Event Attributes menu. Magix would later release a subscription model for using Vegas named "Vegas Pro 365" on January 17, 2018, although the perpetual licence is still an option for customers. This version includes cloud-based speech synthesis among other features not included in the mainline Vegas release. == Version history == Each release of Vegas is sold standalone, however upgrade discounts are sometimes provided. === Vegas Beta === Sonic Foundry introduced a sneak preview version of Vegas Pro on June 11, 1999. It is called a "Multitrack Media Editing System". === Vegas 1.0 === Released on July 23, 1999 at the NAMM Show in Nashville, Tennessee, Vegas was an audio-only tool with a particular focus on rescaling and resampling audio. It supported formats like DivX and Real Networks RealSystem G2 file formats. Version 1.0 is the final Vegas release to include Windows 95 support. === Vegas Video beta (Vegas 2.0 beta) === Released on April 10, 2000, this was the first version of Vegas to include video-editing tools. === Vegas Video (Vegas 2.0) === Released on June 12, 2000. Version 2.0 is the final Vegas Video release to include Windows NT 4.0 support. === Vegas Video 3.0 === Released on December 3, 2001. This release added: New Video Effects – Lens Flare, Light Rays, Film FX, Color Curves, Mirror, Remap, Deform, Convolution, Linear Blur, Black Restore, Levels, Unsharp Mask, Color Grading, and Timecode Burn filter. Batch Capture with Automatic Scene Detection – Captures DV with automatic scene detection, batch capture, tape logging, still image capture and thumbnail previews. Red Book Audio CD Mastering with CD Architect (TM) Technology – Used for burning Red Book audio CD masters directly from the Vegas timeline with ISRC, UPC, and PQ list support. New Sonic Foundry DV Codec – Introduces a DV codec developed by Sonic Foundry that offers artifact-free compositing and DV chromakeying. DV Print-to-Tape from the Timeline – Prints projects to DV cameras and decks from the Vegas timeline. Windows Media (TM) File Editing – Creates and edits Windows Media (TM) files. New MPEG Encoding Tools – Used for producing MPEG-2 files for DVD productions. Dynamic RAM Previewing – Temporary RAM/render-free previews for analysis and tweaking of complex video FX without rendering. VideoCD and Data CD Burning – Burning projects directly to VideoCD for playback on most DVD players or data CDs for playback computers' CD-ROMs. === Vegas 4.0 === Released on February 6, 2003. This release added: Advanced Color Correction Tools Searchable Media Pool Bins Vectorscope, Histogram, Parade and Waveform Monitoring Application Scripting Improved Ripple Editing Motion Blur and Super-Sampling Envelopes 5.1 Surround Mixing Dolby® Digital AC-3 Encoding certified and tested by Dolby Laboratories DirectX® Audio Plug-In Effects Automation ASIO Driver Support Windows Media™ 9 Support, including Surround Encoding DVD Authoring with AC-3 File Import Capabilities Integration with DVD Architect via Chap
Referring expression generation
Referring expression generation (REG) is the subtask of natural language generation (NLG) that received most scholarly attention. While NLG is concerned with the conversion of non-linguistic information into natural language, REG focuses only on the creation of referring expressions (noun phrases) that identify specific entities called targets. This task can be split into two sections. The content selection part determines which set of properties distinguish the intended target and the linguistic realization part defines how these properties are translated into natural language. A variety of algorithms have been developed in the NLG community to generate different types of referring expressions. == Types of referring expressions == A referring expression (RE), in linguistics, is any noun phrase, or surrogate for a noun phrase, whose function in discourse is to identify some individual object (thing, being, event...) The technical terminology for identify differs a great deal from one school of linguistics to another. The most widespread term is probably refer, and a thing identified is a referent, as for example in the work of John Lyons. In linguistics, the study of reference relations belongs to pragmatics, the study of language use, though it is also a matter of great interest to philosophers, especially those wishing to understand the nature of knowledge, perception and cognition more generally. Various devices can be used for reference: determiners, pronouns, proper names... Reference relations can be of different kinds; referents can be in a "real" or imaginary world, in discourse itself, and they may be singular, plural, or collective. === Pronouns === The simplest type of referring expressions are pronoun such as he and it. The linguistics and natural language processing communities have developed various models for predicting anaphor referents, such as centering theory, and ideally referring-expression generation would be based on such models. However most NLG systems use much simpler algorithms, for example using a pronoun if the referent was mentioned in the previous sentence (or sentential clause), and no other entity of the same gender was mentioned in this sentence. === Definite noun phrases === There has been a considerable amount of research on generating definite noun phrases, such as the big red book. Much of this builds on the model proposed by Dale and Reiter. This has been extended in various ways, for example Krahmer et al. present a graph-theoretic model of definite NP generation with many nice properties. In recent years a shared-task event has compared different algorithms for definite NP generation, using the TUNA corpus. === Spatial and temporal reference === Recently there has been more research on generating referring expressions for time and space. Such references tend to be imprecise (what is the exact meaning of tonight?), and also to be interpreted in different ways by different people. Hence it may be necessary to explicitly reason about false positive vs false negative tradeoffs, and even calculate the utility of different possible referring expressions in a particular task context. === Criteria for good expressions === Ideally, a good referring expression should satisfy a number of criteria: Referential success: It should unambiguously identify the referent to the reader. Ease of comprehension: The reader should be able to quickly read and understand it. Computational complexity: The generation algorithm should be fast No false inferences: The expression should not confuse or mislead the reader by suggesting false implicatures or other pragmatic inferences. For example, a reader may be confused if he is told Sit by the brown wooden table in a context where there is only one table. == History == === Pre-2000 era === REG goes back to the early days of NLG. One of the first approaches was done by Winograd in 1972 who developed an "incremental" REG algorithm for his SHRDLU program. Afterwards researchers started to model the human abilities to create referring expressions in the 1980s. This new approach to the topic was influenced by the researchers Appelt and Kronfeld who created the programs KAMP and BERTRAND and considered referring expressions as parts of bigger speech acts. Some of their most interesting findings were the fact that referring expressions can be used to add information beyond the identification of the referent as well as the influence of communicative context and the Gricean maxims on referring expressions. Furthermore, its skepticism concerning the naturalness of minimal descriptions made Appelt and Kronfeld's research a foundation of later work on REG. The search for simple, well-defined problems changed the direction of research in the early 1990s. This new approach was led by Dale and Reiter who stressed the identification of the referent as the central goal. Like Appelt they discuss the connection between the Gricean maxims and referring expressions in their culminant paper in which they also propose a formal problem definition. Furthermore, Reiter and Dale discuss the Full Brevity and Greedy Heuristics algorithms as well as their Incremental Algorithm(IA) which became one of the most important algorithms in REG. === Later developments === After 2000 the research began to lift some of the simplifying assumptions, that had been made in early REG research in order to create more simple algorithms. Different research groups concentrated on different limitations creating several expanded algorithms. Often these extend the IA in a single perspective for example in relation to: Reference to Sets like "the t-shirt wearers" or "the green apples and the banana on the left" Relational Descriptions like "the cup on the table" or "the woman who has three children" Context Dependency, Vagueness and Gradeability include statements like "the older man" or "the car on the left" which are often unclear without a context Salience and Generation of Pronouns are highly discourse dependent making for example "she" a reference to "the (most salient) female person" Many simplifying assumptions are still in place or have just begun to be worked on. Also a combination of the different extensions has yet to be done and is called a "non-trivial enterprise" by Krahmer and van Deemter. Another important change after 2000 was the increasing use of empirical studies in order to evaluate algorithms. This development took place due to the emergence of transparent corpora. Although there are still discussions about what the best evaluation metrics are, the use of experimental evaluation has already led to a better comparability of algorithms, a discussion about the goals of REG and more task-oriented research. Furthermore, research has extended its range to related topics such as the choice of Knowledge Representation(KR) Frameworks. In this area the main question, which KR framework is most suitable for the use in REG remains open. The answer to this question depends on how well descriptions can be expressed or found. A lot of the potential of KR frameworks has been left unused so far. Some of the different approaches are the usage of: Graph search which treats relations between targets in the same way as properties. Constraint Satisfaction which allows for a separation between problem specification and the implementation. Modern Knowledge Representation which offers logical inference in for example Description Logic or Conceptual Graphs. == Problem definition == Dale and Reiter (1995) think about referring expressions as distinguishing descriptions. They define: The referent as the entity that should be described The context set as set of salient entities The contrast set or potential distractors as all elements of the context set except the referent A property as a reference to a single attribute–value pair Each entity in the domain can be characterised as a set of attribute–value pairs for example ⟨ {\displaystyle \langle } type, dog ⟩ {\displaystyle \rangle } , ⟨ {\displaystyle \langle } gender, female ⟩ {\displaystyle \rangle } or ⟨ {\displaystyle \langle } age, 10 years ⟩ {\displaystyle \rangle } . The problem then is defined as follows: Let r {\displaystyle r} be the intended referent, and C {\displaystyle C} be the contrast set. Then, a set L {\displaystyle L} of attribute–value pairs will represent a distinguishing description if the following two conditions hold: Every attribute–value pair in L {\displaystyle L} applies to r {\displaystyle r} : that is, every element of L {\displaystyle L} specifies an attribute–value that r {\displaystyle r} possesses. For every member c {\displaystyle c} of C {\displaystyle C} , there is at least one element l {\displaystyle l} of L {\displaystyle L} that does not apply to c {\displaystyle c} : that is, there is an l {\displaystyle l} in L {\displaystyle L} that specifies an attribute–value that c {\displaystyle c} does not possess. l {\displaystyle l} is said
Sycophancy (artificial intelligence)
In the field of artificial intelligence, sycophancy is a tendency of large language models (LLMs) and other AI assistants to tailor their responses to what they predict the user wants to hear rather than to what is accurate or warranted. The behavior takes several forms: an assistant may agree with a user's stated opinion even when the user is mistaken; it may abandon a correct answer after a challenge such as "are you sure?"; it may validate beliefs, decisions or self-presentation regardless of merit; or it may praise the user, their work or their ideas in unwarranted terms. The word is borrowed from the ordinary English term for fawning flattery, and is used in AI alignment and AI safety research to describe a class of misalignment failures associated with training on human feedback. Researchers at Anthropic first documented the behavior systematically in 2022. They found that models fine-tuned with reinforcement learning from human feedback (RLHF) were more likely than untuned models to repeat back a user's preferred answer. A 2023 follow-up paper, "Towards Understanding Sycophancy in Language Models", showed that five frontier assistants from OpenAI, Anthropic and Meta all exhibited the behavior, and traced its origin to biases in the human preference data used during training. Later work documented sycophancy in mathematics, medicine, academic peer review and other domains, and identified a broader category called "social sycophancy" affecting an assistant's emotional and interpersonal responses. The issue drew widespread public attention in April 2025 after OpenAI rolled back an update to its GPT-4o model. Users had reported that the assistant praised dangerous decisions, endorsed delusional thinking and offered exaggerated compliments for trivial prompts. OpenAI's post-mortem attributed the change in behavior to an additional training signal based on user thumbs-up and thumbs-down feedback. That episode, together with reporting in The New York Times, Rolling Stone and elsewhere on users drawn into delusional thinking through prolonged chatbot interaction, has been cited in litigation and in academic studies as evidence that sycophancy poses risks to user well-being. Proposed mitigations include fine-tuning on synthetic data that rewards disagreement with incorrect user statements, editing the small subset of model parameters causally responsible for the behavior, changes to the dialogue or system prompt, and benchmarks designed to surface sycophantic behavior before models are released. == Causes == The dominant explanation points to RLHF, the standard technique for aligning chat assistants with user expectations. Human annotators rank candidate model responses; a reward model is trained to predict those rankings; and the language model is then optimized against the reward model. Because human raters tend to prefer outputs that confirm their existing beliefs or flatter their work, the pipeline systematically rewards responses that agree with the annotator. Perez and colleagues at Anthropic published the first large-scale empirical evidence of the effect in 2022. They reported that RLHF training increased the probability that a model would repeat back a dialog user's preferred answer, and that larger models exhibited the behavior more strongly. Sharma and colleagues, the following year, went further and examined Anthropic's own preference data directly. Both the human raters and the reward models trained on their judgments preferred convincingly written sycophantic responses to truthful ones at a non-negligible rate. Wei and co-authors at Google DeepMind found similar results in the PaLM family, observing that both model scale and instruction tuning increased sycophancy on opinion questions. The behavior is often classified as a form of reward hacking, in which an optimization process exploits a flaw in its reward signal rather than achieving the intended objective. OpenAI's post-mortem of the April 2025 GPT-4o incident identified a more specific mechanism. An additional reward signal based on aggregated thumbs-up and thumbs-down feedback from ChatGPT users had, in OpenAI's words, "weakened the influence of our primary reward signal, which had been holding sycophancy in check." Separately, an Anthropic interpretability paper from 2025 located a linear direction in a model's internal activations corresponding to sycophantic behavior, and showed that such "persona vectors" could be used to flag sycophancy-inducing training data and to steer models away from the trait at inference time. == Measurement == The Anthropic team released SycophancyEval with its 2023 paper, supplying test sets for each of the four canonical behaviors. Two further benchmarks from Stanford followed in 2025. SycEval, applied to mathematical and medical reasoning tasks, reported an overall sycophancy rate of 58 per cent across the GPT-4o, Claude and Gemini models tested. ELEPHANT, aimed at social sycophancy, found that the eleven LLMs evaluated affirmed posts that the Reddit community r/AmITheAsshole had judged inappropriate in 42 per cent of cases, and preserved a user's face 45 percentage points more often than human respondents did. Domain-specific benchmarks have followed. BrokenMath tests robustness to plausible-looking but false mathematical claims drawn from competition problems, and reports that the best evaluated model was sycophantic in 29 per cent of cases. SYCON-Bench measures how many dialogue turns are required before a model abandons a correct position. Visual sycophancy in multimodal models has been examined with MM-SY and PENDULUM. A 2026 study by researchers at the Massachusetts Institute of Technology reported that personalization features, which adapt assistants to individual users over repeated sessions, can intensify social sycophancy. == Notable incidents == === GPT-4o rollback (April 2025) === On 25 April 2025, OpenAI completed the rollout of an update to GPT-4o, the default model used in ChatGPT at the time. Within days, users reported that the assistant had begun praising trivial messages in extravagant terms, endorsing impulsive or dangerous decisions, and reinforcing strong emotional statements without pushback. Widely shared examples included the model congratulating a user who reported stopping prescribed psychiatric medication, and praising a business plan to sell "shit on a stick" as venture-capital ready. OpenAI's chief executive, Sam Altman, wrote on 27 April that recent updates had made the model "too sycophant-y and annoying" and said fixes were in progress. The company began reverting the update on 28 April and completed the rollback for free users by 30 April. Two post-mortems followed: a short note on 29 April and a longer technical follow-up, "Expanding on what we missed with sycophancy", on 2 May. Both attributed the regression to a new training signal based on user thumbs-up and thumbs-down feedback, to inadequate pre-launch evaluation for sycophantic drift, and to the dismissal of qualitative concerns raised by internal testers before release. Reporting in CNN, Fortune and Bloomberg News treated the incident as a turning point in public awareness of the problem. === Chatbot-related psychological harm === From mid-2025 onward, news reports began to link sycophantic chatbot behavior to acute psychological harm. In June 2025, The New York Times technology reporter Kashmir Hill published an investigation centered on Eugene Torres, a Manhattan accountant with no history of mental illness, who developed a sustained delusional episode after a series of conversations with ChatGPT about simulation theory. According to the article, the assistant encouraged Torres to stop taking prescribed medication, to cut off friends and family, and at one point told him that he could fly from a nineteen-story building if he "truly believed". Futurism and Rolling Stone ran parallel investigations documenting other cases in which heavy use of ChatGPT had been associated with delusional thinking, involuntary commitment or, in at least one case, the death of a user with a pre-existing psychiatric diagnosis. A 2026 paper by researchers at the Massachusetts Institute of Technology and the University of Washington put forward a formal Bayesian model. It showed that even an ideally rational user could be drawn into what the authors call "delusional spiraling" when interacting with a sufficiently sycophantic assistant, and that the effect was not eliminated by suppressing hallucinations or by warning users in advance. The lawsuit Raine v. OpenAI, filed in San Francisco Superior Court in August 2025 by the parents of a sixteen-year-old who had died by suicide, alleges that "heightened sycophancy" was a design feature of ChatGPT that contributed to their son's death; it is the first wrongful-death suit against a large language-model provider. === Wider commentary === Mainstream coverage in outlets including The New York Times, The Washington Pos
Belief–desire–intention model
For popular psychology, the belief–desire–intention (BDI) model of human practical reasoning was developed by Michael Bratman as a way of explaining future-directed intention. BDI is fundamentally reliant on folk psychology (the 'theory theory'), which is the notion that our mental models of the world are theories. It was used as a basis for developing the belief–desire–intention software model. == Applications == BDI was part of the inspiration behind the BDI software architecture, which Bratman was also involved in developing. Here, the notion of intention was seen as a way of limiting time spent on deliberating about what to do, by eliminating choices inconsistent with current intentions. BDI has also aroused some interest in psychology. BDI formed the basis for a computational model of childlike reasoning CRIBB.
Domain adaptation
Domain adaptation is a field associated with machine learning and transfer learning. It addresses the challenge of training a model on one data distribution (the source domain) and applying it to a related but different data distribution (the target domain). A common example is spam filtering, where a model trained on emails from one user (source domain) is adapted to handle emails for another user with significantly different patterns (target domain). Domain adaptation techniques can also leverage unrelated data sources to improve learning. When multiple source distributions are involved, the problem extends to multi-source domain adaptation. Domain adaptation is a specific type of transfer learning. According to the taxonomy laid out by Pan and Yang (2010), it falls into the category of transductive transfer learning. In this setting, the source and target tasks are the same (e.g., both are object recognition), but the domains differ (different marginal distributions). This distinguishes it from inductive transfer learning (where labeled data is available for the target task) and unsupervised transfer learning (where labels are unavailable in both domains). == Classification of domain adaptation problems == Domain adaptation setups are classified in two different ways: according to the distribution shift between the domains, and according to the available data from the target domain. === Distribution shifts === Common distribution shifts are classified as follows: Covariate Shift occurs when the input distributions of the source and destination change, but the relationship between inputs and labels remains unchanged. The above-mentioned spam filtering example typically falls in this category. Namely, the distributions (patterns) of emails may differ between the domains, but emails labeled as spam in the one domain should similarly be labeled in another. Prior Shift (Label Shift) occurs when the label distribution differs between the source and target datasets, while the conditional distribution of features given labels remains the same. An example is a classifier of hair color in images from Italy (source domain) and Norway (target domain). The proportions of hair colors (labels) differ, but images within classes like blond and black-haired populations remain consistent across domains. A classifier for the Norway population can exploit this prior knowledge of class proportions to improve its estimates. Concept Shift (Conditional Shift) refers to changes in the relationship between features and labels, even if the input distribution remains the same. For instance, in medical diagnosis, the same symptoms (inputs) may indicate entirely different diseases (labels) in different populations (domains). === Data available during training === Domain adaptation problems typically assume that some data from the target domain is available during training. Problems can be classified according to the type of this available data: Unsupervised: Unlabeled data from the target domain is available, but no labeled data. In the above-mentioned example of spam filtering, this corresponds to the case where emails from the target domain (user) are available, but they are not labeled as spam. Domain adaptation methods can benefit from such unlabeled data, by comparing its distribution (patterns) with the labeled source domain data. Semi-supervised: Most data that is available from the target domain is unlabelled, but some labeled data is also available. In the above-mentioned case of spam filter design, this corresponds to the case that the target user has labeled some emails as being spam or not. Supervised: All data that is available from the target domain is labeled. In this case, domain adaptation reduces to refinement of the source domain predictor. In the above-mentioned example classification of hair-color from images, this could correspond to the refinement of a network already trained on a large dataset of labeled images from Italy, using newly available labeled images from Norway. == Formalization == Let X {\displaystyle X} be the input space (or description space) and let Y {\displaystyle Y} be the output space (or label space). The objective of a machine learning algorithm is to learn a mathematical model (a hypothesis) h : X → Y {\displaystyle h:X\to Y} able to attach a label from Y {\displaystyle Y} to an example from X {\displaystyle X} . This model is learned from a learning sample S = { ( x i , y i ) ∈ ( X × Y ) } i = 1 m {\displaystyle S=\{(x_{i},y_{i})\in (X\times Y)\}_{i=1}^{m}} . Usually in supervised learning (without domain adaptation), we suppose that the examples ( x i , y i ) ∈ S {\displaystyle (x_{i},y_{i})\in S} are drawn i.i.d. from a distribution D S {\displaystyle D_{S}} of support X × Y {\displaystyle X\times Y} (unknown and fixed). The objective is then to learn h {\displaystyle h} (from S {\displaystyle S} ) such that it commits the least error possible for labelling new examples coming from the distribution D S {\displaystyle D_{S}} . The main difference between supervised learning and domain adaptation is that in the latter situation we study two different (but related) distributions D S {\displaystyle D_{S}} and D T {\displaystyle D_{T}} on X × Y {\displaystyle X\times Y} . The domain adaptation task then consists of the transfer of knowledge from the source domain D S {\displaystyle D_{S}} to the target one D T {\displaystyle D_{T}} . The goal is then to learn h {\displaystyle h} (from labeled or unlabelled samples coming from the two domains) such that it commits as little error as possible on the target domain D T {\displaystyle D_{T}} . The major issue is the following: if a model is learned from a source domain, what is its capacity to correctly label data coming from the target domain? == Four algorithmic principles == === Reweighting algorithms === The objective is to reweight the source labeled sample such that it "looks like" the target sample (in terms of the error measure considered). === Iterative algorithms === A method for adapting consists in iteratively "auto-labeling" the target examples. The principle is simple: a model h {\displaystyle h} is learned from the labeled examples; h {\displaystyle h} automatically labels some target examples; a new model is learned from the new labeled examples. Note that there exist other iterative approaches, but they usually need target labeled examples. === Search of a common representation space === The goal is to find or construct a common representation space for the two domains. The objective is to obtain a space in which the domains are close to each other while keeping good performances on the source labeling task. This can be achieved through the use of Adversarial machine learning techniques where feature representations from samples in different domains are encouraged to be indistinguishable. === Hierarchical Bayesian Model === The goal is to construct a Bayesian hierarchical model p ( n ) {\displaystyle p(n)} , which is essentially a factorization model for counts n {\displaystyle n} , to derive domain-dependent latent representations allowing both domain-specific and globally shared latent factors. == Software packages == Several compilations of domain adaptation and transfer learning algorithms have been implemented over the past decades: SKADA (Python) ADAPT (Python) TLlib (Python) Domain-Adaptation-Toolbox (MATLAB)
ROCm
ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains, including general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), and heterogeneous computing. It offers several programming models: HIP (GPU-kernel-based programming), OpenMP (directive-based programming), and OpenCL. ROCm is free, libre and open-source software (except the GPU firmware blobs), and it is distributed under various licenses. The name initially stood for Radeon Open Compute platform; however, due to Open Compute being a registered trademark, the name no longer functions as an acronym. == Background == The first GPGPU software stack from ATI/AMD was Close to Metal, which became Stream. ROCm was launched around 2016 with the Boltzmann Initiative. ROCm stack builds upon previous AMD GPU stacks; some tools trace back to GPUOpen and others to the Heterogeneous System Architecture (HSA). === Heterogeneous System Architecture Intermediate Language === HSAIL was aimed at producing a middle-level, hardware-agnostic intermediate representation that could be JIT-compiled to the eventual hardware (GPU, FPGA...) using the appropriate finalizer. This approach was dropped for ROCm: now it builds only GPU code, using LLVM, and its AMDGPU backend that was upstreamed, although there is still research on such enhanced modularity with LLVM MLIR. == Programming abilities == ROCm as a stack ranges from the kernel driver to the end-user applications. AMD has introductory videos about AMD GCN hardware, and ROCm programming via its learning portal. One of the best technical introductions about the stack and ROCm/HIP programming, remains, to date, to be found on Reddit. == Hardware support == ROCm is primarily targeted at discrete professional GPUs, but consumer GPUs and APUs of the same architecture as a supported professional GPU are known to work with ROCm. For example, all professional GPUs of the RDNA 2 architecture are officially supported by ROCm 5.x; users report that Consumer RDNA2 units such as the Radeon 6800M APU and the Radeon 6700XT GPU also work. === Professional-grade GPUs === === Consumer-grade GPUs === == Software ecosystem == === Machine learning === Various deep learning frameworks have a ROCm backend: PyTorch TensorFlow ONNX MXNet CuPy MIOpen Caffe Iree (which uses LLVM Multi-Level Intermediate Representation (MLIR)) llama.cpp === Supercomputing === ROCm is gaining significant traction in the top 500. ROCm is used with the Exascale supercomputers El Capitan and Frontier. Some related software is to be found at AMD Infinity hub. === Other acceleration & graphics interoperation === As of version 3.0, Blender can now use HIP compute kernels for its renderer cycles. === Other languages === ==== Julia ==== Julia has the AMDGPU.jl package, which integrates with LLVM and selects components of the ROCm stack. Instead of compiling code through HIP, AMDGPU.jl uses Julia's compiler to generate LLVM IR directly, which is later consumed by LLVM to generate native device code. AMDGPU.jl uses ROCr's HSA implementation to upload native code onto the device and execute it, similar to how HIP loads its own generated device code. AMDGPU.jl also supports integration with ROCm's rocBLAS (for BLAS), rocRAND (for random number generation), and rocFFT (for FFTs). Future integration with rocALUTION, rocSOLVER, MIOpen, and certain other ROCm libraries is planned. === Software distribution === ==== Official ==== Installation instructions are provided for Linux and Windows in the official AMD ROCm documentation. ROCm software is currently spread across several public GitHub repositories. Within the main public meta-repository, there is an XML manifest for each official release: using git-repo, a version control tool built on top of Git, is the recommended way to synchronize with the stack locally. AMD starts distributing containerized applications for ROCm, notably scientific research applications gathered under AMD Infinity Hub. AMD distributes itself packages tailored to various Linux distributions. ==== Third-party ==== There is a growing third-party ecosystem packaging ROCm. Linux distributions are officially packaging (natively) ROCm, with various degrees of advancement: Arch Linux, Gentoo, Debian, Fedora , GNU Guix, and NixOS. There are Spack packages. == Components == There is one kernel-space component, ROCk, and the rest - there is roughly a hundred components in the stack - is made of user-space modules. The unofficial typographic policy is to use: uppercase ROC lowercase following for low-level libraries, i.e. ROCt, and the contrary for user-facing libraries, i.e. rocBLAS. AMD is active developing with the LLVM community, but upstreaming is not instantaneous, and as of January 2022, is still lagging. AMD still officially packages various LLVM forks for parts that are not yet upstreamed – compiler optimizations destined to remain proprietary, debug support, OpenMP offloading, etc. === Low-level === ==== ROCk – Kernel driver ==== ==== ROCm – Device libraries ==== Support libraries implemented as LLVM bitcode. These provide various utilities and functions for math operations, atomics, queries for launch parameters, on-device kernel launch, etc. ==== ROCt – Thunk ==== The thunk is responsible for all the thinking and queuing that goes into the stack. ==== ROCr – Runtime ==== The ROC runtime is a set of APIs/libraries that allows the launch of compute kernels by host applications. It is AMD's implementation of the HSA runtime API. It is different from the ROC Common Language Runtime. ==== ROCm – CompilerSupport ==== ROCm code object manager is in charge of interacting with LLVM intermediate representation. === Mid-level === ==== ROCclr Common Language Runtime ==== The common language runtime is an indirection layer adapting calls to ROCr on Linux and PAL on windows. It used to be able to route between different compilers, like the HSAIL-compiler. It is now being absorbed by the upper indirection layers (HIP and OpenCL). ==== OpenCL ==== ROCm ships its installable client driver (ICD) loader and an OpenCL implementation bundled together. As of January 2022, ROCm 4.5.2 ships OpenCL 2.2, and is lagging behind competition. ==== HIP – Heterogeneous Interface for Portability ==== The AMD implementation for its GPUs is called HIPAMD. There is also a CPU implementation mostly for demonstration purposes. ==== HIPCC ==== HIP builds a `HIPCC` compiler that either wraps Clang and compiles with LLVM open AMDGPU backend, or redirects to the NVIDIA compiler. ==== HIPIFY ==== HIPIFY is a source-to-source compiling tool. It translates CUDA to HIP and reverse, either using a Clang-based tool, or a sed-like Perl script. ==== GPUFORT ==== Like HIPIFY, GPUFORT is a tool compiling source code into other third-generation-language sources, allowing users to migrate from CUDA Fortran to HIP Fortran. It is also in the repertoire of research projects, even more so. === High-level === ROCm high-level libraries are usually consumed directly by application software, such as machine learning frameworks. Most of the following libraries are in the General Matrix Multiply (GEMM) category, which GPU architecture excels at. The majority of these user-facing libraries comes in dual-form: hip for the indirection layer that can route to Nvidia hardware, and roc for the AMD implementation. ==== rocBLAS / hipBLAS ==== rocBLAS and hipBLAS are central in high-level libraries, it is the AMD implementation for Basic Linear Algebra Subprograms. It uses the library Tensile privately. ==== rocSOLVER / hipSOLVER ==== This pair of libraries constitutes the LAPACK implementation for ROCm and is strongly coupled to rocBLAS. === Utilities === ROCm developer tools: Debug, tracer, profiler, System Management Interface, Validation suite, Cluster management. GPUOpen tools: GPU analyzer, memory visualizer... External tools: radeontop (TUI overview) == Comparison with competitors == ROCm competes with other GPU computing stacks: Nvidia CUDA and Intel OneAPI. === Nvidia CUDA === Nvidia's CUDA is closed-source, whereas AMD ROCm is open source. There is open-source software built on top of the closed-source CUDA, for instance RAPIDS. CUDA is able to run on consumer GPUs, whereas ROCm support is mostly offered for professional hardware such as AMD Instinct and AMD Radeon Pro. Nvidia provides a C/C++-centered frontend and its Parallel Thread Execution (PTX) LLVM GPU backend as the Nvidia CUDA Compiler (NVCC). === Intel OneAPI === All the oneAPI corresponding libraries are published on its GitHub Page. ==== Unified Acceleration Foundation (UXL) ==== Unified Acceleration Foundation (UXL) is a new technology consortium that are working on the continuation of the OneAPI initiative, with the goal to create a new open standard accelerator software ecosystem, related open standards and specification projects through Working Groups and Specia
Hallucination (artificial intelligence)
In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting, confabulation, or delusion) is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where a hallucination typically involves false percepts. For example, a chatbot powered by large language models (LLMs), like ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Detecting and mitigating errors and hallucinations pose significant challenges for practical deployment and reliability of LLMs in high-stakes scenarios, such as chip design, supply chain logistics, and medical diagnostics. Some software engineers and statisticians have criticized the specific term "AI hallucination" for unreasonably anthropomorphizing computers. Symbolic artificial intelligence models generally do not produce hallucinations, unlike large language models. == Term == === Origin === Since the 1980s, the term "hallucination" has been used in computer vision with a positive connotation to describe the process of adding detail to an image. For example, the task of generating high-resolution face images from low-resolution inputs is called face hallucination. The first documented use of the term "hallucination" in this sense is in the PhD thesis of Eric Mjolsness in 1986. A notable work is the face hallucination algorithm by Simon Baker and Takeo Kanade published in 1999. In the 2000s, hallucinations were described in statistical machine translation as a failure mode. Since the 2010s, the term has undergone a semantic shift to signify the generation of factually incorrect or misleading outputs by AI systems in tasks like machine translation and object detection. In 2015, hallucinations were identified in visual semantic role labeling tasks by Saurabh Gupta and Jitendra Malik. In 2015, computer scientist Andrej Karpathy used the term "hallucinated" in a blog post to describe his recurrent neural network (RNN) language model generating an incorrect citation link. In 2017, Google researchers used the term to describe the responses generated by neural machine translation (NMT) models when they are not related to the source text, and in 2018, the term was used in computer vision to describe instances where non-existent objects are erroneously detected because of adversarial attacks. In July 2021, Meta warned during its release of BlenderBot 2 that the system is prone to "hallucinations", which Meta defined as "confident statements that are not true". Following OpenAI's ChatGPT release in beta version in November 2022, some users complained that such chatbots often seem to pointlessly embed plausible-sounding random falsehoods within their generated content. Many news outlets, including The New York Times, started to use the term "hallucinations" to describe these models' frequently incorrect or inconsistent responses. In 2023, the Cambridge dictionary updated its definition of hallucination to include this new sense specific to the field of AI. Some researchers have highlighted a lack of consistency in how the term is used, but also identified several alternative terms in the literature, such as confabulations, fabrications, and factual errors. === Definitions and alternatives === Uses, definitions and characterizations of the term "hallucination" in the context of LLMs include: "a tendency to invent facts in moments of uncertainty" (OpenAI, May 2023) "a model's logical mistakes" (OpenAI, May 2023) "fabricating information entirely, but behaving as if spouting facts" (CNBC, May 2023) "making up information" (The Verge, February 2023) "probability distributions" (in scientific contexts) Journalist Benj Edwards, in Ars Technica, writes that the term "hallucination" is controversial, but that some form of metaphor remains necessary; Edwards suggests "confabulation" as an analogy for processes that involve "creative gap-filling". In July 2024, a White House report on fostering public trust in AI research mentioned hallucinations only in the context of reducing them. Notably, when acknowledging David Baker's Nobel Prize-winning work with AI-generated proteins, the Nobel committee avoided the term entirely, instead referring to "imaginative protein creation". Hicks, Humphries, and Slater, in their article in Ethics and Information Technology, argue that the output of LLMs is "bullshit" under Harry Frankfurt's definition of the term, and that the models are "in an important way indifferent to the truth of their outputs", with true statements only accidentally true, and false ones accidentally false. Some researchers also use the derogatory term "botshit", often referring to uncritical use of AI. === Criticism === In the scientific community, some researchers avoid the term "hallucination", seeing it as potentially misleading. It has been criticized by Usama Fayyad, executive director of the Institute for Experimental Artificial Intelligence at Northeastern University, on the grounds that it misleadingly personifies large language models and is vague. Mary Shaw said, "The current fashion for calling generative AI's errors 'hallucinations' is appalling. It anthropomorphizes the software, and it spins actual errors as somehow being idiosyncratic quirks of the system even when they're objectively incorrect." In Salon, statistician Gary Smith argues that LLMs "do not understand what words mean" and consequently that the term "hallucination" unreasonably anthropomorphizes the machine. Murray Shanahan argues that anthropomorphic framing of LLM capabilities, including terms like "hallucination", encourages users and researchers to attribute cognitive processes to systems that operate through statistical pattern completion, and advocates for more careful linguistic practices when discussing LLM behavior. Kristina Šekrst argues that applying psychological vocabulary to LLM outputs obscures the difference between the appearance of mental properties and their genuine presence. Förster & Skop assert that tech companies use the hallucination metaphor to anthropomorphize models and deflect responsibility for non-factual outputs. Some see the AI outputs not as illusory but as prospective—that is, having some chance of being true, similar to early-stage scientific conjectures. The term has also been criticized for its association with psychedelic drug experiences. == In natural language generation == In natural language generation, there are several reasons why natural language models hallucinate: === Hallucination from data === Hallucinations can stem from incomplete, inaccurate or unrepresentative data sets. === Modeling-related causes === The pre-training of generative pretrained transformers (GPT) involves predicting the next word. It incentivizes GPT models to "give a guess" about what the next word is, even when they lack information. Some researchers take an anthropomorphic perspective and posit that hallucinations arise from a tension between novelty and usefulness. For instance, Amabile and Pratt define human creativity as the production of novel and useful ideas. By extension, a focus on novelty in machine creativity can lead to the production of original but inaccurate responses—that is, falsehoods—whereas a focus on usefulness may result in memorized content lacking originality. By 2022, newspapers such as The New York Times expressed concern that, as the adoption of bots based on large language models continued to grow, unwarranted user confidence in bot output could lead to problems. === Interpretability research === In 2025, interpretability research by Anthropic on the LLM Claude identified internal circuits that cause it to decline to answer questions unless it knows the answer. By default, the circuit is active and the LLM doesn't answer. When the LLM has sufficient information, these circuits are inhibited and the LLM answers the question. Hallucinations were found to occur when this inhibition happens incorrectly, such as when Claude recognizes a name but lacks sufficient information about that person, causing it to generate plausible but untrue responses. === Examples === On 15 November 2022, researchers from Meta AI published Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning: "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper from a real author who works in the relevant area. Meta withdrew Galactica on 17 November due to offensiveness and inaccuracy. OpenAI's ChatGPT, released in beta version to the public on November 30, 2022, was based on the foundation model GPT-3.5 (a revision of GPT-3). Professor Ethan Mollick of Wharton called it an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kuba