Algo IP: Intellectual Property in AI Datasets, Insights and Outputs – the Growing Importance of Trade Secrets
Whether a large cloud operator providing AI as a Service, a specialist AI developer licensing its AI software on premise to its customers, or a customer looking to capture the most value from the AI systems it uses, organisations are attributing increasing value to the data that their AI algorithms process. The data may be characterised as:
input data – training, testing and operational datasets input into the AI software;
that data as processed by the AI;
output data – from those processing operations; and
insights and data derived from the output data.
Although data is legally inert in and of itself, an increasingly wide range of rights and duties arises in relation to it.[i] As regards IP, these are principally copyright, database right (in the EU), confidentiality and (since 2018) trade secrets. IP rights in relation to data are broad, as they are enforceable against the whole world (‘in rem’), but shallow, as they are currently of uncertain scope and infringement is challenging to prove. On the other hand, contract rights relating to data are narrow, as they are only enforceable against a contracting party (‘in personam’), but deep, as breach is proved as a question of fact once the contractual obligation has been established. In practice, contract remains king in data land: the $30bn global financial market data industry has grown up over the last 40 years around stable contract norms with little litigation.
AI datasets and the law of confidence
Copyright and database right each protect the expression and form of information, not its substance. This, coupled with the dynamic nature of AI data as inputs, outputs and insight, can make copyright and database right challenging to apply to data. Equitable rules protecting confidentiality of information may, however, provide a better form of IP protection, as they can protect from disclosure the substance of data that is not publicly known. Under UK law, protection extends in three steps. The first is the classic statement of the requirements for a successful breach of confidence action by Megarry J. in the 1969 case Coco v AN Clark (Engineers) Ltd:[ii]
“in my judgement, three elements are normally required if, apart from contract, a case of breach of confidence is to succeed. First, the information itself… must “have the necessary quality of confidence about it.” Secondly, that information must have been imparted in circumstances importing an obligation of confidence. Thirdly, there must be an unauthorised use of that information to the detriment of the party communicating it.”[iii]
To be protectible in this way, the information must therefore be shown both to be confidential and acquired under a duty of confidence.
In the second step, protection can extend to the aggregation of information even where parts of it are in the public domain and so not otherwise confidential. This is the result of a line of cases[iv] in the last three of which – called the ‘wireline’ cases – the information concerned was essentially in the public domain but the courts held that the structure of the information in its aggregated form was not and so was protectible as confidential.
As the duty of confidence goes to the substance, and not just the form, of data, the third step is that protection can trace through to later generations of data derived from the initial confidential data, potentially through the mechanism of a constructive trust.[v]
AI datasets and trade secrets
The EU Trade Secrets Directive[vi] (‘TS Directive’) brings EU law more closely into line with Article 39 of the WTO TRIPS Agreement[vii] (which gives IPR protection to trade secrets as undisclosed information) and the US Uniform Trade Secrets Act[viii]. Article 2(1)(a) TS Directive sets out that a trade secret has three elements – secrecy, commercial value and steps taken to keep it secret. It defines a trade secret as:
“information which …
…is secret in the sense that it is not as a body or in the precise configuration and assembly of its components generally known among or readily accessible to persons within the circles that normally deal with the kind of information in question;
has commercial value because it is secret; and
has been subject to reasonable steps under the circumstances, by the person lawfully in control of the information, to keep it secret.”
The TS Directive came into effect in the UK on 9 June 2018 through the Trade Secrets (Enforcement, etc.) Regulations 2018 (‘TS Regs’).[ix] As to the ‘join’ between the TS Directive and the UK law of confidence, the Explanatory Notes to the TS Regs confirmed that:
“the issue of whether the acquisition, use or disclosure of a trade secret is unlawful is determined by reference to the principles of the law of confidence” (i.e. UK law deals with disclosure and breach);
“[w]here the measures, procedures and remedies available in an action for breach of confidence offer wider protection to a trade secret holder than that offered under the [TS Regs], the trade secret holder may apply for, and a court may grant, them provided they comply with [certain safeguards set out in Article 1 of the TS Directive]” (i.e. if UK law gives broader rights, a claimant can invoke them).
In a study in 2017, the EU Intellectual Property Office (‘EU IPO’) found[x] that (i) the use of trade secrets to protect innovation was higher than patents for most companies, in most sectors and in all EU Member States; (ii) trade secrets were more likely to be used than patents in innovation in process and services; and (iii) trade secrets were preferred to patents in strongly competitive markets. In a legal environment where attaching IP rights to data is challenging, trade secrecy is therefore emerging as the most likely candidate right, especially in a more digitally connected, AI- and cloud-enabled world. This is because the area of trade secrecy is relatively structured and harmonised and interoperates benignly with national (common or equitable) laws of confidence. However, there remain significant challenges, with a number of key questions to be addressed:
how do you evidence ownership?
how do you show secrecy and prevent erosion in a world where big data is increasingly digitally accessible?
how do you identify a trade secret when, as in AI, dynamic and changing variables are a key feature of the algorithm or dataset?
what constitutes reasonable steps to keep data secret – must you apply standards similar to the GDPR and NIS Directive to take ‘appropriate technical and organisational measures’?
how do you document and keep trade secret records when the algo or dataset is dynamic and changing?
The next few years are likely to see sustained efforts to remove these perceived impediments to a robust and efficient trade secret legal regime.
Practical points to bear in mind
Market participants aiming to maximise their data IP rights should consider the following steps:
asserting (by contract and by website, documentation and other relevant notices) copyright, database right, confidentiality and trade secrets for input, output and insight data;
ensuring across all website and other notices and contracts that all relevant data is stated to be confidential and trade secret in order to minimise leakage;
documenting the secrecy of data, its commercial value and steps taken to keep it secret (perhaps by an ‘appropriate technical and organisational measures’ standard similar to GDPR and the NIS Directive) to maximise the availability of trade secret protection;
taking a contractual acknowledgement from the counterparty that information disclosed is confidential to the organisation, that the organisation is the trade secret holder and that the information is secret, has commercial value and has been subject to reasonable steps in the circumstances to keep it secret;
asserting in written methodologies and specifications that the way in which the contents of database(s) and dataset(s) concerned are selected and arranged is the product of the author’s own intellectual creation in order to maximise the likelihood of database copyright availability;
ensuring relevant documentation shows substantial ‘OVP-ing’ (obtaining, verifying or presenting) investment in collecting the data in the database as well as creating it so as to maximise the likelihood of database right availability;
considering the copyright position as a whole, taking into account literary copyright in information architecture and documents associated with the data;
taking effective assignments of present and future copyright and database right (and as necessary, trade secrets and confidential information) in all relevant contracts; and
reviewing the contractual definitions of:
confidential information so as to assess what data is included and ensure it covers trade secrets;
IP rights so as to assess whether confidential information and trade secrets are included, and to ensure consistency of treatment between data as confidential information and data as IP rights; and
derived data to ensure that it aligns to your interests.
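By way of illustration only, the documentation step above (recording secrecy, commercial value and the steps taken to keep data secret, even where the dataset is dynamic) could be supported by a simple versioned register of dataset fingerprints. The Python sketch below is a hypothetical example and not an established standard: the names SecretRecord, append_record and verify_chain, the field layout and the sample dataset are all assumptions made for this illustration. It hashes each dataset snapshot with SHA-256 and links each record to the previous one so that later alteration of an earlier entry is detectable. Whether such a register amounts to ‘reasonable steps’ in any given case remains a legal question that the code does not answer.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SecretRecord:
    """One entry in a hypothetical trade secret register (illustrative fields)."""
    dataset_name: str
    version: str
    sha256: str                       # fingerprint of the dataset snapshot
    recorded_at: str                  # ISO 8601 timestamp
    protective_steps: Tuple[str, ...]  # e.g. access controls, NDAs
    prev_record_hash: str             # links this record to the previous one


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a dataset snapshot."""
    return hashlib.sha256(data).hexdigest()


def record_hash(record: SecretRecord) -> str:
    """Hash a whole record deterministically (keys sorted before encoding)."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_record(register: List[SecretRecord], dataset_name: str, version: str,
                  data: bytes, protective_steps: Tuple[str, ...],
                  recorded_at: Optional[str] = None) -> SecretRecord:
    """Append a record for a new snapshot, chained to the last entry."""
    prev = record_hash(register[-1]) if register else ""
    record = SecretRecord(
        dataset_name=dataset_name,
        version=version,
        sha256=fingerprint(data),
        recorded_at=recorded_at or datetime.now(timezone.utc).isoformat(),
        protective_steps=protective_steps,
        prev_record_hash=prev,
    )
    register.append(record)
    return record


def verify_chain(register: List[SecretRecord]) -> bool:
    """Check that every record still points at the hash of its predecessor."""
    expected_prev = ""
    for record in register:
        if record.prev_record_hash != expected_prev:
            return False
        expected_prev = record_hash(record)
    return True


if __name__ == "__main__":
    register: List[SecretRecord] = []
    steps = ("role-based access control", "NDA with all recipients")
    # Two snapshots of a changing (hypothetical) training dataset.
    append_record(register, "training-set", "v1", b"row1\nrow2\n", steps)
    append_record(register, "training-set", "v2", b"row1\nrow2\nrow3\n", steps)
    print("register intact:", verify_chain(register))
```

Recording a fingerprint rather than the data itself means the register can be shared with advisers or a counterparty without disclosing the underlying information, which is the design choice this sketch is intended to show.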
[iii] See ‘Gurry on Breach of Confidence’, second edition, OUP 2012, paragraph 2.139, page 70.
[iv] Albert (Prince) v Strange (1 M&G 25); Exchange Telegraph Co. Ltd v Gregory & Co. (1 QB 147); Exchange Telegraph Co. Ltd v Central News Ltd (2 Ch 48); Weatherby & Sons v International Horse Agency and Exchange Ltd (2 Ch 297). The last three are the ‘wireline’ cases.
[v] See ‘Gurry on Breach of Confidence’, supra, paragraphs 20.17 – 20.25, pages 787 – 790.
[vi] Directive 2016/943 of 8 June 2016 on the protection of undisclosed know-how and business information (trade secrets) against their unlawful acquisition, use and disclosure (OJ L157/2016) – https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016L0943&from=EN.