Tagging AI-generated images with SynthID

SynthID-Image is an effective post-hoc watermark, but it can sometimes be bypassed (e.g. by re-rendering) or made visible after a strong level correction.

Author: Radosław Kołacki Published: Updated: Technology

The question “is this an AI image?” Until recently, the decision could be quite simple because the generators left repeatable traces. You looked at the hands, the teeth, the text on the sign, whether the light and shadows „made sense”, and often got it right. This stage has passed. Not because people suddenly stopped being observant, but because the problem itself stopped being optical. Modern diffusion models are trained so that their results have the statistics of photographic images: textures, edge distributions, tonal transitions, even the „randomness” of details look more and more like in the training data. This means that the typical „by eye” today works more as a collection of circumstantial evidence than a verification tool. You can find an anomaly and be right, but just as often you will find a real photo, only heavily processed; or to an AI image that has gone through a grain filter, compression and recoding to JPEG so intense that the differences that were once screaming are now drowned in the usual noise of the Internet.

It is worth noting that the distribution of „visual errors” itself has changed in a way that spoils intuition. Where once spectacular mistakes dominated (an extra finger, letters pretending to be letters), today semantic errors are more common: something is illogical, but visually correct. And even more often, nothing is clearly wrong, it’s just that the image has a style that some people consider „too perfect” and others consider it the result of a good lens and color grade. These types of impressions are susceptible to context and bias. If someone „knows” that the image may be from AI, they start to see symptoms; if he doesn’t know, he often doesn’t notice them. This is not a solid basis for conclusions.

If you want something that is repeatably defensible, you need to shift the question from „do I see it?” to “do I have a verifiable trace of origin?”. In practice, there are three classes of such traces, with different evidentiary weight. The first is provenance: information about how and with what the file was created and edited, often in the form of signed metadata or a chain of credentials (standards such as C2PA are a good reference point here, but not the only one). This can be the strongest form of evidence because it doesn’t try to „guess” the story from the pixels. But it has an obvious problem: provenance often gets lost along the way. All it takes is a screenshot, re-upload to a social media site, export to another format, and the nice origin story disappears. The second class is watermarking, i.e. a situation in which the image itself carries a hidden signal in the data that can be detected with a tool. This is where SynthID comes in: the idea is simple, although the implementation can be complex, an invisible „signature” structure is added to the image so that you can later check whether this signature is present. This is much more „hard” than visual heuristics, but has a fundamental limitation: it only works if a watermark has been applied. Lack of detection does not mean „it’s not AI”; only means „this particular character was not found”. The third class is statistical detectors: models that attempt to distinguish AI from non-AI images based on distributional features. They can be useful as an auxiliary signal, but they have their own set of problems: the generator evolves and the detector falls behind (drift), or the image comes from a different domain than the one on which the detector was trained (domain shift). Then false alarms and omissions increase, often in ways that are difficult to predict without painstaking calibration.

The most prudent practice is therefore not to find one „magic test”, but to layer. First you look for provenance, because that’s the best context if it has survived. Then you check whether the image carries a watermark from a specific ecosystem (SynthID is an example of this approach). Only at the end do you activate your eyesight and statistical detectors as support, not as a final judgement. This hierarchy is important because it organizes the rest of the discussion: since „by eye” is becoming less and less decisive, it is worth understanding where the realism of diffusion comes from, and then how precisely SynthID-Image can embed in an image a signal that cannot be seen, but can be detected, even after compression and online processing.

Diffusion in a nutshell

To understand why „AI vs. non-AI” detection has become so ungrateful, you need to go down a level for a moment and see what diffusion models actually do. Their basic trick is surprisingly elegant: instead of teaching the network to „draw” an image from scratch, we teach it to reverse the image destruction process. This is a practical difference, not a philosophical one. So in the standard description we haveprocess forward(forward/noising) ireverse process(reverse / denoising).

In the forward process, we take a real image and add Gaussian noise to it in multiple steps. Aftervolvolt steps we get, increasingly resembling pure noise. The key is that this process is structured to be mathematically simple: typically you choose a variance schedule (how much noise we add in a stepvolvolt) and defines Markov transitions. Then you can show thatxvolx_tcan be written directly as a mixture of signal and noise: in simple terms, „a little image + a little Gaussian”. This means that the model has a very clear goal when training: it receives a noisy example and is supposed to reproduce information about what noise was added or what the cleaner image looked like.

The reverse process is what happens during generation. We start with virtually pure noise, and perform a series of denoising steps. At each step, the network receives the current state and information about the given step and predicts something like „how much of this signal is noise”. In the most popular parameterization, the network predicts the noise estimate, and the sampler uses it to calculate the next, less noisy state.

Where is the „prompt” and conditional generation in all this? This is the second, equally important specific: most modern diffusions areconditioned(conditional). Most often, this is done through the cross-attention mechanism: the text (or other condition) is encoded into vectors, and U-Net „consults” this code during denoising so that the image moves in the direction consistent with the description. Technically, this means that the sampling distribution is not „any image from the data”, but „an image from the data that matches the condition”. And here comes an important point for detection: since the model is trained on huge sets of real images, and then in a reverse process reconstructs the sample in a way consistent with this distribution, the result has very natural statistics. Not because the model „understands the world”, but because it has learned to imitate the visual regularities present in the data very well.

Finally, one thing that often gets missed in popular descriptions:many systems do not work directly with pixels, only in the so-calledlatent space. In latent diffusion, the image is first compressed by an autoencoder (VAE) into a lower-dimensional representation, and diffusion occurs in this latent. Only at the end does the VAE decoder map the latent back to RGB. This speeds up generation and reduces costs, but it also has consequences: part of the image „structure” is shaped in the latents, and part in the decoder. Therefore, when we talk about watermarking later, we must always specify whether the mark is embedded „on the fly” in latents/sampling orpost-hocon ready pixels. This distinction is the basis for a sensible conversation about SynthID-Image, because this solution does not have to interfere with the diffusion process itself: it can operate on a ready-made image, regardless of the model with which it was generated.

This all sounds technical, but it has a simple consequence: diffusion produces images that are „statistically reliable” because they are created by controlled denoising in the direction of the data distribution. Therefore, simple hunting for visual errors loses its power, and the importance of methods that do not try to guess from appearance, but rely on a verifiable signal embedded in the data itself – and here we smoothly move to SynthID.

„At a glance” what works, what doesn’t, and why it can’t be the foundation

Now that we know that diffusion can produce images with very „photographic” statistics, it is easier to understand why the human eye is increasingly losing out. However, it is worth not going to extremes: visual assessment is still useful, but it must be treated as suchpreliminary tool, not as an evidentiary method. This is the difference between „I have a suspicion” and „I have verification.”

The most classic repertoire of suspicions concerns places where the image is sensitive to semantics and geometry at the same time. Hands and fingers have long been a weak point because they require anatomical consistency (number of fingers, shape of joints) and lighting consistency (shadows between fingers, reflections on nails). Similarly, the text: the letters are particularly merciless, because a person recognizes them categorically either it is „A” or not. Especially in Polish. Finally, light and reflections: window, metal, glass, mirror can expose the model if the reflections do not match the position of the objects.

The problem is that these cues are naturally „pushed” out of the models. If thousands of users start pointing out the same error, there is pressure to fix it: either through better training, or through a post-generation filter, or through supporting tools (e.g. automatic correction of faces, hands, text). The effect is that a heuristic that was great two years ago is much less effective today and there is no guarantee that it will be effective tomorrow.

There is also a second reason, often more important in practice: the image you see on the web is almost never the „clean” output of the model. It goes through compression, scaling, sharpening, denoising, added grain, sometimes a color filter, and sometimes several such stages one after the other. This not only masks possible AI artifacts, but also adds them to real photos. If someone uploads a photo from the phone after heavy night denoising, sharpens it, and the platform additionally recodes it to JPEG, the effect may look „unnaturally smooth” and it is easy to cause a false alarm. Similarly, if someone takes an AI image, adds film grain and puts it through compression, they get „believable dirt” that many people mistake for authenticity.

Therefore, when assessing „by eye”, it is worth separating two things:structural artifacts(inconsistencies of geometry, physics, semantics) iprocessing artifacts(JPEG, sharpen, denoise, grain). The latter are treacherous because they are everywhere and say nothing about their origin. And the first ones, although sometimes strong, are increasingly rare in models and increasingly „fixed” in pipelines.

In practice, you can treat vision as a first-pass filter: if something is clearly illogical or physically impossible, it is a good clue. But if the image is just „nice” or „too smooth”, it’s not evidence, just a feeling. And here we come to the point: since appearance is, at best, a soft premise, we need something that can be checked regardless of taste, compression and context – i.e. the signal in the data – to decide. This is exactly what the two families of solutions that make the most sense today do:provenance(if maintained) iwatermarking(if used). And among the watermarks, the most interesting one, because it was designed „on the scale of the Internet”, is the Google DeepMind approach: SynthID, and in particular SynthID-Image.

In the next chapter, we will move from the general idea to the mechanics: what the encoder-detector architecture looks like, what a payload is, how resistance to JPEG and cropping is learned, and how at the pixel level we can imagine an „invisible” change that the detector can still detect.

SynthID and SynthID-Image: technical core (post-hoc) + mini-demo on pixels

In practice, „SynthID” is not just one trick algorithmwatermarking + detection system, designed to work in real Internet circulation: images are compressed to JPEG, scaled, cropped, passed through filters and re-uploaded many times. Google DeepMind describes it directly as a compromise of three things at once: the sign has to beinvisible,detectableandresistant to typical modifications. In the version for images (SynthID-Image), there is one key sentence that organizes the entire architecture: this approachpost-hoc and model-independent, the mark is superimposedfor the finished paintingby a separate module (encoder) and then detected by the corresponding detector/decoder. This is not „modifying the diffusion process on the fly.” [1]

This “post-hoc” is more important than it sounds. It means that the pipeline looks something like this: the generator (diffusion or whatever) produces the final image, and only then comes inSynthID-Image encoder, which adds a very small, precisely selected perturbation to the image. This perturbation is not random „change every tenth pixel” noise, but the result of training: DeepMind describes SynthID as a solution based ontwo deep learning models(watermarking and identification), trained together to increase detection on the one hand and minimize visibility on the other, including: by „visually matching” the sign to the content. [2] Paper SynthID-Image formalizes this problem as „shifting the distribution of images in a way detectable by the detector”, but while maintaining quality constraints, and the quality is verified not only by metrics, but also by research with people. [1]

The second thing that distinguishes SynthID-Image from many popular watermarking descriptions ispayload. The system is not only supposed to respond „there is/is not a mark”, but also to provide multi-bit information (e.g. for versioning and organizational management of watermarks). The authors emphasize that payload is to be resistant to „daily” content changes, and in their approach, the problem of detection and payload is solved consciously (in the paper they describe them as separate requirements and discuss why payload is crucial for provenance in practice). [1]

The third thing isdetection as a probabilistic result, not a judgment. In operational terms, the detector gives the result you needcalibrate(selection of the threshold for the target level of false alarms). [1] In public communication, DeepMind already in 2023 wrote about the interpretation of the result through levels of confidence (e.g. three levels of confidence). [2] This is an „engineering” approach: the tool is supposed to work at scale and have a controlled FPR, and not pretend to be a 100% oracle.

There is also „internet scale” in the background: the paper explicitly says that SynthID-Image was used for watermarkingover ten billionimages and video frames in Google services, and the verification service was available to trusted testers. [3] In the product layer, Google also describes a detection portal (SynthID Detector), which is supposed to detect characters for various modalities and be able to indicate fragments of content that are most likely to be watermarked. [4]

Mini-demo: what „micro changes” look like at the pixel level

To feel it practically, let’s taketruea 4×4 pixel cutout from brown cardboard on a mug (smooth, light surface – a place where the sign must be particularly non-invasive and delicate). Below is the color matrix in HEXread directly from the file.

Original photo

Example of an original small 4×4 px area

#C6B19C  #C7B29D  #C8B3A0  #C8B3A0
#C6B19C  #C7B29D  #C7B29F  #C8B3A0
#C7B19C  #C8B29D  #C7B29F  #C8B3A0
#C7B19C  #C9B39E  #C8B3A0  #C9B4A1

Theoreticallywithout access to a real SynthID-Image encoder it should not be possible to generate a “real” watermarkfor file. However, we can show you what it looks likescaleandcharacterthe changes we are talking about – micro-shifts of the order of single units in the RGB channels. So below isillustrative„watermarked” version (toy demo), where we change the selected pixels minimally (e.g. +1 in one channel, −1 in another), so that the color remains visually almost the same.

Illustration „after watermarking” (4×4, toy demo)

#C7B09C  #C6B39D  #C9B2A0  #C8B3A1
#C5B29C  #C8B19D  #C7B39E  #C9B2A0
#C7B09D  #C8B39C  #C6B39F  #C8B3A1
#C8B09C  #C8B49D  #C9B2A0  #C9B4A0

What does „engineering” show, even in such a small sample? Type change#C6B19C → #C7B09Cmeans that the red channel increased by 1, the green channel decreased by 1 and the blue channel remained unchanged. To the eye it will still be the same warm shade of beige, especially in the context of neighboring pixels, lighting and the entire scene. But what matters to the detector is not a single pixel, but thatsuch microchanges form a diffuse, statistically coherent signalspread over many parts of the painting.

And here is the most important intuition: a SynthID-Image style detectordoes not “read the sign” from one place, nor is it looking for a simple pattern like „youngest bit in every tenth pixel”. In the solutions described by Google DeepMind, the idea is for the encoder to learn to introduce perturbations that are (a) minimal perceptually, but (b) still detectable after typical transformations, such as JPEG compression, scaling or cropping, which is why training involves simulating such transformations in the learning loop. [1] In public descriptions of SynthID, there is also a practical aspect of interpreting the result: the detection tool usually does not provide an „absolute verdict”, but a result in the form of a confidence level/score, which is calibrated for a reasonable compromise between false alarms and omissions. [2]

So this mini-example has one purpose: to show that at the hex level you only „see” microscopic shifts, which in themselves prove nothing. Only when they are generated by a trained encoder in a coordinated manner and checked by the corresponding detector do they become a verifiable signal of origin. Therefore, in practice, you do not „see” SynthID visually or through individual pixels,it is checked with a tool, which knows what type of distributed structure it is looking for, i.e. it is able to „decode” this character.

What does SynthID look like in practice?

It is worth adding one caveat here: the resistance of SynthID-Image is designed primarily for typical distribution transformations (compression, scaling, cropping), but there is a separate class of actions that are not ordinary image editing, butgenerative re-rendering– repeated, controlled re-synthesis that preserves the semantics and composition of the scene, and at the same time can overwrite the low-level „layer” of pixels carrying the watermark. In the repositorySynthID-Bypass [9]the authors describe a proof-of-concept in which such processing reduces the detectability of the sign and show an interesting „diagnostic” step using Comfy.

Example of a Comfy workflow trying to reduce the detectability of a sign, source https://github.com/00quebec/Synthid-Bypass

A very similar intuition can be seen in the example I prepared below: after generating a black square#000000and manipulation of levels, using aggressive tonal correction, it was possible to „extract” a structure with encoded elements in the file – invisible in the normal preview, but present in pixel micro-differences. This is, of course, not an attempt to „overthrow” SynthID, but only to clarify the boundaries: SynthID-Image is a watermark focused on the scale of the Internet and a specific „threat model”, and in the case of an active opponent (especially using generative „washing” of content), one must take into account that the detection rate may drop – which is directly related to the discussion about security and compromises (fidelity/robustness/security) at work SynthID-Image.

Example of tonal correction of a black square

Contrast: post-hoc (SynthID-Image) vs in-process (watermark „on the fly”)

In practice, watermarking in generative images is divided into two approaches that sound similar („add watermark”), but are technically and operationally completely different systems. The first is thispost-hoc, i.e. applying a mark to the finished image. The second one isin-process, i.e. „weaving” the sign into the generation process itself, step by step.

Post-hoc, as in SynthID-Image,is conceptually simpler: the generative model does its job, produces the final image, and only then a separate module (encoder) introduces micro changes in pixels that are invisible to the eye, but recognizable by the corresponding detector. The biggest advantage of this approach is very engineering:no need to touch the generator. This is what „model-independent” means, the same watermarking scheme can be connected to different generators because it works on their output. From an implementation perspective, this is a huge win: you can develop marking and detection as a separate component, version it, test it with new transformations (JPEG, crop, resize) and update it without rebuilding the entire generating model. The SynthID-Image publication also shows that post-hoc is easier to treat as an „internet-scale product”: you calibrate the detector, ensure a low percentage of false alarms, and build immunity through training with augmentations simulating online processing.

The price of this convenience is equally specific:the encoder cannot count on the generator to „smooth” the perturbation. Now that the image is ready, the encoder must choose the changes very carefully so as not to leave any visible traces, while still maintaining detectability after compression and edits. This often leads to a harder „quality vs. resistance” trade-off: if you want a sign to survive more aggressive transformations, you have to introduce it more strongly, and this increases the risk of perceptual artifacts. Post-hoc is also, by definition, more vulnerable to certain classes of „remove character by strong degradation” attacks, because the attacker does not have to fight the generation mechanics, because he only needs to grind the image enough for the detector to lose the signal (which, of course, usually also ruins the quality).

In-process, i.e. watermark „on the fly”,it works differently: the sign is not added after the fact, but is part of the generation trajectory. In diffusion models, this usually means that during subsequent denoising steps, the system introduces controlled, small preferences in how the image is created, so that the final result „carries” the signal. Technically, this can be implemented in several ways: by modifying how subsequent noise samples are selected, by controlling certain features in the latents, or by introducing an additional objective/regularization that enforces the presence of an easily detectable signature. There are schemes in the literature that do this directly in the diffusion process (e.g. the „tree-ring” family of approaches or other latent watermarking approaches), but the common idea is the same: the generator and the watermark are intertwined.

The advantage of in-process is that the watermark can be „more organic” to the generated image and is not an overlay on the finished image, but results from how the image is created. In certain settings, this provides greater resistance to some transformations, because the signal is more closely related to the structure of the generated features. Additionally, by having control over the entire generation process, you can design the watermark from the beginning with specific channels (e.g. frequency) in mind and better balance it with perception.

But the price is high and very practical:you need to have control over the generator. In-process is not an „overlay”, just a change in the behavior of the model or sampler, so:

  • it is more difficult to implement in many models at once,
  • it is more difficult to update without affecting the quality and distribution of results,
  • and in the case of external models (where there is no sampler control) this is simply impossible.

In addition, in-process complicates the evaluation more. Since watermark is part of generation, the question changes: is the quality degradation due to the watermark or a different sampling configuration? How to compare fair models? Post-hoc gives a cleaner experiment: the generator is constant, and watermarking is simply a separate module.

Summary

It seems that for todaypost-hoc wins operationally, ain-processit remains scientifically attractive and perhaps in the longer term it will become a more „qualitative” form of watermarking in the sense that the mark will be more deeply integrated into the generation mechanics, rather than added at the end. However, post-hoc (like SynthID-Image) has an advantage that in practice determines its choice: it iseasy to standardize and scale. You can treat them as an independent infrastructure component, connect them behind any generator, develop the encoder and detector in a separate cycle, tune the resistance to real transformations (JPEG, resize, crop, filters), calibrate the thresholds for a low percentage of false alarms and update the system without having to „dig” the generating model itself. This matters if the goal is broad applicability: repeatability, risk control, compatibility with various pipelines, and the ability to maintain the system for years even though the generating models change.

In-process can offer interesting trade-offs, potentially more „organic” signal embedding and resistance to some classes of attacks, because the watermark is related to the generation trajectory, and not only to the final image. However, it comes at a high price:requires full control over generation(modeling and sampling), is more difficult to implement across many different models, and updates can impact the quality and distribution of results, complicating both evaluation and maintenance in production.

In other words:in-processit can be great where the entire technology stack is under control and stable, but it becomes problematic when trying to build a solution for a wider audience that is supposed to work across different products, on different models and in conditions of constant change. This is why large companies willingly choose post-hoc approaches as a layer at the end of the entire process, because they provide the best ratio of implementability to benefits, and maintain in-process research as an alternative line of development that may prove crucial in closed environments or in future systems where control over the generator is full and there is no need to pay high integration costs.

Bibliography

[1] S. Gowalet al., „SynthID-Image: Image watermarking at internet scale„,arXiv preprint arXiv:2510.09263, 10 Oct 2025, DOI: 10.48550/arXiv.2510.09263.

[2] Google DeepMind, “SynthID” (product page / general documentation), accessed: Jan 14, 2026.

[3] Google AI, “Advancing AI safely and responsibly” (section: SynthID / SynthID Detector), accessed: Jan 14, 2026.

[4] Coalition for Content Provenance and Authenticity (C2PA), “C2PA Technical Specification v1.3„, March 29, 2023.

[5] Y. Wen, J. Kirchenbauer, J. Geiping, T. Goldstein, „Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust„,arXiv preprint arXiv:2305.20030, May 31, 2023.

[6] P. Fernandezet al., „The Stable Signature: Rooting Watermarks in Latent Diffusion Models„,ICCV 2023(paper), also:arXiv preprint arXiv:2303.15435, 2023.

[7] H. Huanget al., „ROBIN: Robust and Invisible Watermarks for Diffusion Models…„,NeurIPS 2024 Proceedings(paper).

[8] „Guidance Watermarking for Diffusion Models„,arXiv/OpenReview, 2025 (on-the-fly watermarking via guidance).

[9]00quebec, „SynthID-Bypass” (README), GitHub, accessed: 21/01/2026.