
How to protect a proprietary algorithm

A proprietary algorithm should be treated like a secret recipe written in code: the value lies in the logic, the versions, the datasets, the tests, and the decisions that brought it to life. Documenting it properly helps reconstruct "what existed, when, and in what form". Before sharing it, integrating it into a SaaS, or showing it to partners and investors, prepare an organised dossier and certify its key versions.

1. How it usually happens

In software, the problem almost always starts in a trivial way: a demo, a technical call, a shared repository "just for a few days", a pitch deck sent to a potential partner, an integration test with an enterprise client. Then, months later, a very similar feature appears in someone else's product and everyone starts looking for the famous "previous version".

Proprietary algorithms are rarely a single magical file called final_true_definitive_algorithm.py, even though many projects have at least one folder with that kind of name, usually created at 2:17 AM. More often, they are a combination of source code, configurations, model weights, prompts, pipelines, technical documentation, benchmarks, notebooks, test datasets, logs, and architectural choices.

In the AI world, the situation is even more slippery: the value might lie in a ranking method, a fine-tuning strategy, a feature engineering system, a prompt chain, a specific way of evaluating outputs, or a combination of steps that, taken individually, seem harmless. It’s a bit like grandma's sauce: everyone sees the tomato and basil, but few know exactly when to add the salt.

An unusual perspective: even "failure" can be valuable. Discarded tests, worse metrics, unadopted models, and intermediate commits can demonstrate a coherent development path. In a dispute, the technical history counts almost as much as the final result.

2. What you need to prove

The point is not to magically prove that "the algorithm is yours", but to build credible documentation regarding the existence of certain versions, at specific times, with specific technical contents.

In practice, it can be useful to prove:

  • the existence of a specific version of the source code;
  • the presence of a certain algorithmic logic in a technical document;
  • the version of a model, a pipeline, or a prompt system;
  • the content of a demo sent to third parties;
  • the preparation date of benchmarks, tests, and results;
  • the state of a repository or a software package;
  • the content of emails, chats, or materials shared with partners, clients, or collaborators;
  • the evolutionary sequence from prototype to MVP to commercial version;
  • the access conditions under which the algorithm was shown or delivered.

The goal is to make the classic phrase "this idea was already in the air" much harder to use. Sure, the smart toaster was also "in the air" before someone actually connected it to an app, but in professional conflicts, you need files, dates, versions, and context.
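One common way to pin down "specific technical contents" is a cryptographic fingerprint of each file. The text above does not prescribe any particular technique, so treat this as a minimal sketch: the function name `fingerprint` is hypothetical, and SHA-256 is simply a widely used choice. A digest on its own says nothing about dates; it only lets you show later that a file is byte-for-byte the one you recorded.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files (model weights,
        # datasets) do not have to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Changing a single character in the file produces a completely different digest, which is why the fingerprint should be computed on the exported original and not on a copy that some tool has re-saved.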

3. What to collect

Prepare an organised documentary package, readable even by those who don't live inside your IDE.

Collect, when available:

  • original source files;
  • repository exported as a ZIP, preserving the folder structure;
  • changelogs and release notes;
  • technical documentation in PDF;
  • architecture diagrams;
  • notebooks, training scripts, configuration files;
  • prompts, internal policies, and operational chains used by the AI system;
  • test datasets or dataset descriptions, if shareable;
  • benchmark results and evaluation reports;
  • screenshots of the product, dashboard, or demo;
  • short videos showing how it works;
  • exported emails and chats related to submissions, demos, agreements, or feedback;
  • contracts, NDAs, quotes, commercial proposals, and pitch materials;
  • system logs or internal reports showing executions and versions;
  • a README file clearly explaining what the package contains.

A good habit: create an "index" file explaining what is in the package, why it is relevant, and which version of the project it refers to. It’s less glamorous than a generative model, but when facts need reconstructing, it’s worth its weight in gold.
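The "index" file can be generated instead of typed by hand. The sketch below is one possible layout, not a standard: the file name `INDEX.json`, the field names, and the helpers `build_index` and `write_index` are all assumptions made for illustration. It lists every file in the package with its relative path, size, and SHA-256 digest, next to the version and a short note on why the package matters.

```python
import hashlib
import json
from pathlib import Path

INDEX_NAME = "INDEX.json"  # hypothetical file name for the index


def build_index(package_dir: Path, version: str, purpose: str) -> dict:
    """Describe every file in the package: relative path, size, SHA-256."""
    entries = []
    for path in sorted(package_dir.rglob("*")):
        # Skip folders and the index itself, so re-running is stable.
        if path.is_file() and path.name != INDEX_NAME:
            data = path.read_bytes()
            entries.append({
                "path": path.relative_to(package_dir).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return {"version": version, "purpose": purpose, "files": entries}


def write_index(package_dir: Path, version: str, purpose: str) -> Path:
    """Write the index as JSON inside the package and return its path."""
    index_path = package_dir / INDEX_NAME
    index = build_index(package_dir, version, purpose)
    index_path.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index_path
```

A call such as `write_index(Path("B2B_Ranking_Algorithm_v0.9_2026-05-01"), "v0.9", "Version shown in the partner demo")` would drop the index next to the README. The human-readable explanation still belongs in the README; the generated index only covers the mechanical part.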

4. How to proceed

Start by identifying the versions that truly matter. These are usually the first working prototype, the version shown to third parties, the delivered or integrated version, and the versions that precede a major negotiation or a potential dispute.

Create a clean folder for each version, using clear names and consistent dates, for example: B2B_Ranking_Algorithm_v0.9_2026-05-01. Put the code, documents, screenshots, reports, and a short README inside. Avoid mixing random notes, incomplete exports, and unexplained duplicate files: digital chaos is only funny in memes about cluttered desktops.
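A naming convention only helps if everyone applies it the same way, so it can be worth encoding it once. The sketch below assumes the pattern suggested by the example name above (project words joined by underscores, then `_v<version>`, then an ISO date); the helpers `folder_name` and `parse_folder_name` are hypothetical, and your own convention may reasonably differ.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Assumed pattern: <Project_Words>_v<version>_<YYYY-MM-DD>
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*)"
    r"_v(?P<version>\d+(?:\.\d+)*)"
    r"_(?P<day>\d{4}-\d{2}-\d{2})$"
)


def folder_name(project: str, version: str, day: date) -> str:
    """Build a folder name from a free-text project title."""
    slug = "_".join(re.findall(r"[A-Za-z0-9]+", project))
    return f"{slug}_v{version}_{day.isoformat()}"


def parse_folder_name(name: str) -> Optional[Tuple[str, str, date]]:
    """Return (project, version, date), or None if the name does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    try:
        day = date.fromisoformat(match["day"])  # rejects dates like 2026-13-40
    except ValueError:
        return None
    return match["project"], match["version"], day
```

With these definitions, `folder_name("B2B Ranking Algorithm", "0.9", date(2026, 5, 1))` returns `B2B_Ranking_Algorithm_v0.9_2026-05-01`, while `parse_folder_name("last_good_one2.zip")` returns `None`.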

Practical procedure:

  • identify the key versions of the algorithm;
  • export original files without modifying them unnecessarily;
  • add a descriptive README;
  • create a ZIP archive for each relevant version;
  • also keep the most important individual files;
  • document any sharing with third parties via emails, chats, or minutes;
  • certify the main packages and documents;
  • archive everything securely in multiple copies;
  • record in an internal spreadsheet what was certified and why.
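Two of the steps above, the ZIP archive per version and the internal record of what was certified and why, lend themselves to a small script. This is a sketch under stated assumptions: the function names, the CSV columns, and the use of a CSV file as the "internal spreadsheet" are all illustrative choices. The certification step itself is left out, because it depends on the service or procedure you use.

```python
import csv
import hashlib
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def archive_version(version_dir: Path, output_dir: Path) -> Path:
    """Create <folder name>.zip, preserving the folder structure inside it.

    output_dir should live outside version_dir, otherwise the archive
    would end up trying to include itself.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    archive_path = output_dir / f"{version_dir.name}.zip"
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(version_dir.rglob("*")):
            if path.is_file():
                inner = Path(version_dir.name) / path.relative_to(version_dir)
                archive.write(path, arcname=inner.as_posix())
    return archive_path


def record_certification(register: Path, archive_path: Path, reason: str) -> dict:
    """Append one row to a CSV register: what was archived and why."""
    row = {
        # Local clock only: useful for bookkeeping, not proof of date.
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "archive": archive_path.name,
        "sha256": hashlib.sha256(archive_path.read_bytes()).hexdigest(),
        "reason": reason,
    }
    is_new = not register.exists()
    with register.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new:
            writer.writeheader()  # header only on first use
        writer.writerow(row)
    return row
```

Appending rows instead of rewriting the file keeps the register itself a simple history: each substantial update to the algorithm adds one line.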

When you update the algorithm substantially, repeat the process. A single certification done at the beginning is useful, but an orderly sequence of versions tells the story of the project's evolution far better.

5. Mistakes to avoid

The most common mistake is waiting for a problem to arise. When a dispute hits, files have already been renamed, compressed, moved to chats, re-saved by some tool, and turned into a small swamp of inconsistent metadata.

Other common mistakes:

  • certifying only a screenshot and forgetting the code;
  • saving only the final file without technical context;
  • modifying files after they have been certified;
  • using vague names like final.zip, new.zip, last_good_one2.zip;
  • sharing code or demos without a written trail;
  • ignoring contracts, NDAs, and access rules;
  • failing to separate what is proprietary from what derives from libraries, frameworks, or open-source components;
  • forgetting dependencies, configurations, and parameters necessary to understand how it works;
  • keeping everything on just one computer or in a single cloud account.
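The mistake of modifying files after certification is easy to make silently, and just as easy to detect if you kept a digest for each file. The sketch below assumes you recorded a SHA-256 digest per relative path at certification time; the function name `check_package` and the three result keys are hypothetical. It needs Python 3.9 or later for the built-in generic type hints.

```python
import hashlib
from pathlib import Path


def check_package(package_dir: Path, recorded: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files on disk with the digests recorded earlier.

    `recorded` maps a relative POSIX path to the SHA-256 hex digest
    noted when the package was certified.
    """
    current = {
        path.relative_to(package_dir).as_posix():
            hashlib.sha256(path.read_bytes()).hexdigest()
        for path in package_dir.rglob("*")
        if path.is_file()
    }
    return {
        # Same path, different bytes: the file was changed afterwards.
        "modified": sorted(p for p in recorded if p in current and current[p] != recorded[p]),
        # Recorded at certification time but no longer on disk.
        "missing": sorted(p for p in recorded if p not in current),
        # On disk now but never recorded: added after the fact.
        "unlisted": sorted(p for p in current if p not in recorded),
    }
```

Three empty lists mean the package still matches what was recorded. Anything else is worth explaining in writing before the package is used as evidence.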

Besides technical certification, always consider organisational measures: access control, repository permissions, written agreements, internal policies, delivery tracking, management of external collaborators, and review of third-party licences. Free certification is useful because it lets you lock down an important version immediately, without turning every technical step into an administrative mini-project.

6. After the documentation

Once you have documented the algorithm, integrate this practice into the development cycle. Every major release should have a small "documentary snapshot": what changed, who worked on it, which files describe it, and where they are stored.
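The "documentary snapshot" can be as small as one structured record per release. The sketch below maps the four questions above (what changed, who worked on it, which files describe it, where they are stored) onto a dataclass; the class name `ReleaseSnapshot` and its field names are hypothetical, and JSON is just one convenient storage format. It needs Python 3.9 or later.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class ReleaseSnapshot:
    version: str
    released_on: date
    what_changed: str
    contributors: list[str] = field(default_factory=list)
    describing_files: list[str] = field(default_factory=list)
    storage_locations: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the snapshot; the date becomes an ISO string."""
        record = asdict(self)
        record["released_on"] = self.released_on.isoformat()
        return json.dumps(record, indent=2)
```

A release script could build one `ReleaseSnapshot` per major version and save the output of `to_json()` next to the changelog, so the snapshot is produced as part of the release and not reconstructed from memory later.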

Involve the right people based on the context: technical leads, management, IP consultants, legal advisors, corporate advisors, contractual or insurance partners. If the algorithm is shown to investors, clients, or suppliers, prepare a shareable version beforehand and keep confidential materials separate.

In the event of a conflict or suspected misuse, calmly gather the evidence you already have, avoid impulsive file modifications, preserve communications, and seek professional support before sending formal accusations or demands. In software, as in cooking, the recipe counts; but when someone claims to have invented the exact same sauce, you also need to know who had the jar in their pantry first.