
From Living Rooms To Courtrooms: Smart Speakers Under India’s Data Protection And Evidence Law

Ujjwal Gupta

Introduction

Smart speakers have swiftly evolved from novelty consumer gadgets into a staple of many homes. As of 2025, there are over 8.4 billion voice assistant devices in use worldwide, a figure that exceeds the global population. India follows this pattern closely: about 20.9% of Indian consumers own smart speakers, and the Indian market is expected to reach $14 billion over the next ten years. These numbers suggest that voice assistants are no longer niche consumer products but mass-adopted technologies in use in private homes, workplaces, and shared domestic environments. Moreover, unlike phones or wearables, which require deliberate user interaction to activate, smart speakers operate on an always-listening principle: wake word detection entails continuous ambient audio capture.


This design tension has started drawing regulatory attention. In October 2025, French prosecutors opened an investigation into Apple’s Siri, following a complaint from the Ligue des Droits de l’Homme alleging that user conversations were collected and analysed without valid consent. The investigation underscores a central concern for regulators: always-listening devices sit uneasily with consent-based data protection regimes. Although the case is brought under European law, its implications for India are significant. The Digital Personal Data Protection Act, 2023 (hereinafter ‘DPDP Act’), adopts an explicitly consent-first model and, in certain respects, is stricter than the General Data Protection Regulation (hereinafter ‘GDPR’). And if the legality of consent for ambient voice capture is itself uncertain, the next question follows: can recordings made by such devices ever be safely used as evidence in the courtroom?


Against this backdrop, this article examines smart speakers as a regulatory and evidentiary problem under Indian law by, firstly, analysing their structural incompatibility with the consent, notice, purpose limitation, and child protection requirements of the DPDP Act; secondly, evaluating the admissibility, reliability, and constitutional implications of smart speaker recordings; and lastly, proposing targeted reforms to address these gaps.


Why Smart Speakers Struggle with Consent under the DPDP Act

A smart speaker can be described as a speaker with a built-in microphone that allows users to interact with other smart devices or internet services by voice. This definition, although simple, conceals a fact with significant legal implications: voice data is personal data. Under Section 2(t) of the DPDP Act, any data about an individual who is identifiable by or in relation to such data is personal data. The human voice is distinctive enough to identify a person, directly or indirectly, through tone, speech patterns, language, or voiceprints. The capture and processing of voice data therefore fall squarely within the DPDP Act.


Smart speakers operate on an always-listening architecture, continuously monitoring and processing ambient audio. This holds even when the device is owned by one user but captures the voice of another individual within hearing range. The device’s technical design does not allow for the selective exclusion of non-users: there is no mechanism that can prevent the processing of a specific person’s voice or pause collection until their identity or consent is confirmed. Moreover, research has found that smart speakers may accidentally record users up to 19 times a day without explicit instructions. The problem is especially acute in shared domestic environments, where residents, visitors, domestic workers, and children are exposed to continuous recording but have no legal relationship with the Data Fiduciary.
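
To make the structural point concrete, the following is a minimal sketch of the control flow of an always-listening device. It is illustrative only: the function names, buffer sizes, and callbacks are assumptions for exposition, not any vendor’s actual implementation. What matters is what the loop lacks: no speaker identification, no consent check, and no way to exclude bystanders before their audio enters the buffer.

```python
import collections

# Illustrative parameters; real devices differ.
SAMPLE_RATE = 16_000            # audio samples per second (assumed)
PRE_ROLL_SECONDS = 2            # rolling pre-roll kept in memory (assumed)

def listen_forever(microphone, detect_wake_word, send_to_cloud):
    """Core loop of an always-listening device, radically simplified.

    Note what is absent: no check of *who* is speaking, no consent
    lookup, and no way to exclude guests, workers, or children --
    every sample within range enters the buffer before any decision
    about identity or consent could even be made.
    """
    ring_buffer = collections.deque(maxlen=SAMPLE_RATE * PRE_ROLL_SECONDS)
    for sample in microphone:                 # runs indefinitely
        ring_buffer.append(sample)            # ambient audio captured first
        if detect_wake_word(ring_buffer):     # may also fire accidentally
            # On a (possibly false) activation, the buffered pre-roll --
            # which may contain bystanders' speech -- is uploaded too.
            send_to_cloud(list(ring_buffer))
            ring_buffer.clear()
```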


The situation is particularly dire where children’s rights are concerned. Any processing of children’s personal data requires verifiable parental consent, as mandated by Section 9 of the DPDP Act read with Rule 10 of the DPDP Rules, 2025. Rule 10 goes further: it obliges Data Fiduciaries to implement technical and organisational measures ensuring that the consenting person is an identifiable adult parent, using reliable identity and age credentials such as government-backed identifiers or digital locker tokens. This confirmation must occur before processing begins. Smart speakers were not designed to meet this standard. They perform no real-time identity verification and cannot distinguish a child’s voice from an adult’s. Parental controls, where they exist, are account-based, post hoc, and non-verificatory, and thus fall well short of the due diligence Rule 10 demands.
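
For contrast, here is a rough sketch of what a Rule 10-compliant gate would have to look like: consent is verified against a reliable adult credential before any processing begins. Every name is hypothetical; in particular, verify_adult_credential stands in for a reliable identity and age check (such as a government-backed token) that no current smart speaker exposes.

```python
from typing import Callable, Optional

def gate_child_processing(
    audio: bytes,
    speaker_id: str,
    consent_registry: dict,
    verify_adult_credential: Callable[[str], bool],
) -> Optional[bytes]:
    """Hypothetical Rule 10-style gate: processing starts only after the
    consenting person is confirmed, via a reliable credential, to be an
    identifiable adult parent. No current smart speaker exposes such a
    hook; every name here is illustrative, not a real vendor or
    government API.
    """
    record = consent_registry.get(speaker_id)
    if record is None or not verify_adult_credential(record["parent_credential"]):
        return None        # refuse: no verified parental consent on file
    return audio           # only now may the clip flow downstream
```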


Structural Non-Compliance with DPDP Obligations

Section 5 of the DPDP Act, read with Rule 3 of the DPDP Rules, imposes a strict notice obligation on Data Fiduciaries. Among other things, Data Principals must be notified of the personal data being collected and the purpose of its collection. Smart speakers struggle here too: once installed, notices are generally delivered via companion apps, which are typically accessible only to the primary purchaser. Other residents, guests, and domestic workers, whose voices are inevitably processed whether they wish it or not, receive no notice at all. This undermines transparency and amounts to non-compliance with the Act.


This failure of transparency also produces breaches of purpose limitation. Under Sections 4 and 6 of the DPDP Act, personal data may be processed only for a specific, lawful, and valid purpose, backed by consent. Yet the cases of Apple’s Siri, Google Assistant, and Amazon Alexa reveal a recurring pattern: voice recordings were accessed by contractors for grading and analytics, in some cases capturing sensitive domestic or medical conversations. Apple’s Siri grading programme, introduced in 2019 and reviewed by regulators in 2025, showed that human review was bundled under very general terms such as ‘service improvement’. Similarly, the FTC’s 2023 lawsuit against Amazon showed that Alexa retained children’s voice data indefinitely and reused it for separate internal purposes even after deletion requests. Under the DPDP Act, such secondary uses require separate, granular consent. Turning one-time consent into blanket authorisation for any purpose breaches purpose limitation.


From a theoretical perspective, this is a breach of what Helen Nissenbaum calls ‘contextual integrity’: privacy is preserved when the flow of personal information conforms to the social norms and expectations of the specific context in which it occurs. A voice command to play music or set a reminder, for instance, belongs to a context of domestic convenience. Using the same data for human review or model training shifts it to a context of corporate optimisation, governed by different norms and expectations.


Finally, the data minimisation and retention obligations under Rule 8 sit poorly with smart speaker architecture. Wake-word detection requires constant monitoring of ambient audio, so background conversations of non-users are recorded as well. Accidental activations occur several times a day, adding to the data collected. And while business incentives favour retaining data for long periods to train models and improve accuracy, the DPDP Rules require that data be deleted once it is no longer needed for the specified purpose.
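
The gap is easy to state in code. The sketch below shows the kind of purpose-bound erasure logic the Rules contemplate: a recording is deleted once its specified purpose is served, with a fixed ceiling only as a backstop. The Recording structure, the purpose labels, and the 24-hour backstop are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Recording:                  # hypothetical record of one stored clip
    audio_id: str
    purpose: str                  # e.g. "command_fulfilment"
    created_at: datetime          # timezone-aware creation timestamp
    purpose_served: bool          # set once the command has been executed

# Hypothetical backstop; Rule 8 ties erasure to purpose, not to a clock,
# so a fixed ceiling is only a safety net, never a substitute.
BACKSTOP = timedelta(hours=24)

def erase_when_spent(store: list[Recording]) -> list[Recording]:
    """Keep only recordings whose specified purpose is still live."""
    now = datetime.now(timezone.utc)
    kept = []
    for rec in store:
        if rec.purpose_served or (now - rec.created_at) > BACKSTOP:
            continue              # delete: purpose fulfilled, or backstop hit
        kept.append(rec)          # purpose still pending; retain for now
    return kept
```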


These compliance failures are not mere regulatory issues. They bear directly on the legal reliability of smart speaker recordings: if voice data is captured without proper consent, adequate notice, and a clearly defined purpose, courts will be compelled to ask whether such material can safely be used as evidence. This invites the key question: what, then, is the evidentiary value of smart speaker recordings?


Smart Speaker Recordings and Section 65B Admissibility

The first legal questions that smart speaker recordings raise are whether they are admissible and how they may be proved in court. Under Indian law, the answer begins with classification. Voice recordings produced by smart speakers are ‘electronic records’ within the meaning of Section 2(1)(t) of the Information Technology Act, 2000, since they are sound data stored, received, or transmitted in electronic form. Once so classified, such recordings constitute ‘documentary evidence’ within the meaning of Section 3 of the Indian Evidence Act, 1872 (hereinafter ‘IEA’). Their admissibility, however, is governed by Section 65B IEA (now Section 63 of the Bharatiya Sakshya Adhiniyam, 2023) alone, which constitutes a complete code for proof of electronic evidence.


Initial court rulings seemed to allow some leeway. In State (NCT of Delhi) v. Navjot Sandhu, the Supreme Court held that electronic records could be proved by secondary evidence under Sections 63 and 65 IEA even without a certificate under Section 65B. Later, in Anvar P.V. v. P.K. Basheer, the Court reversed this position, holding that Sections 63 and 65 cannot be invoked for electronic records and that compliance with Section 65B is mandatory. The Court in Shafhi Mohammad v. State of Himachal Pradesh then briefly departed from this, treating the certificate requirement as procedural and waivable where the party relying on the electronic record does not control the device. That dilution was flatly rejected by a three-judge bench in Arjun Panditrao Khotkar v. Kailash Kushanrao Gorantyal, which states the current position of law.


The bench in Arjun Panditrao, comprising Rohinton Fali Nariman, S. Ravindra Bhat, and V. Ramasubramanian, JJ., clarified several issues directly relevant to smart speaker evidence. First, a certificate under Section 65B(4) is mandatory and a condition precedent to admissibility; certification cannot be replaced by oral testimony. At the same time, the Court allowed a narrow procedural accommodation: a party that shows bona fide efforts may obtain judicial directions for the production of the Section 65B(4) certificate where the person or authority in control of the device or system refuses to provide it, through the court’s powers under Section 165 IEA, Order XVI Rules 6, 7, and 10 of the Code of Civil Procedure, 1908, and Section 91 of the Code of Criminal Procedure, 1973.


This brings into focus the certification bottleneck created by cloud architecture. Smart speaker recordings are not kept locally; they are processed and saved on distributed cloud servers operated by platforms like Amazon, Google, and Apple. For evidentiary purposes, such distributed systems are treated as a single computer network under Section 65B(3). Crucially, the user neither controls the recording mechanism nor performs the extraction. The Section 65B certificate must therefore come from the platform itself. Although courts may summon intermediaries under the Code of Criminal Procedure and the Evidence Act, whether platforms will cooperate in practice remains uncertain. And the realities of internal access, automated processing, and occasional human review raise serious concerns about the chain of custody.


Thus, while smart speaker recordings are admissible in theory as electronic evidence subject to Section 65B IEA, their use in practice runs into a veritable certification barrier. Because such recordings are generated, stored, and processed within cloud systems controlled by platforms, users lack control over the relevant computer system for the purposes of issuing a Section 65B(4) certificate. Admissibility therefore becomes contingent on the cooperation of Data Fiduciaries, against whom private parties have no enforceable right of certification. Nor does user access to recordings through companion applications remove this impediment: such access is limited to curated outputs and gives users no insight into internal processing or third-party access. Since Section 65B requires certification by a person responsible for the operation of the computer system itself, users, as Data Principals, remain unable to satisfy the requirement.


The upshot is that smart speaker recordings may be admitted in theory, but their weight remains questionable. Unintentional activations are the chief factor undermining clarity of context, speaker attribution, and the inference of mens rea. Smart speaker evidence is, in short, platform-dependent, technically fragile, and highly fact-sensitive, and it demands rigorous judicial scrutiny at every stage.


Does Smart Speaker Evidence Violate Self-Incrimination or Privacy Rights?

The use of smart speaker recordings as evidence must also be examined under Article 20(3) of the Constitution, which shields an accused from being compelled to be a witness against themselves. Smart speaker recordings are typically made without police presence or coercion of any kind: statements are recorded because people voluntarily speak within range of the device, even if unaware that it has been accidentally activated. Unawareness of activation is not compulsion. The routine production of pre-existing smart speaker recordings therefore does not, by itself, amount to compelled self-incrimination.


Furthermore, in Ritesh Sinha v. State of Uttar Pradesh, the Supreme Court treated voice samples as non-testimonial, akin to fingerprints or handwriting specimens. This view now has statutory support in Section 349 of the Bharatiya Nagarik Suraksha Sanhita, 2023, which specifically authorises Magistrates to direct the collection of voice samples during investigations. Using smart speaker recordings for voice identification or proof of similarity therefore does not, in the normal course, violate Article 20(3).


Nevertheless, Article 20(3) must not be conflated with Article 21. Article 20(3) deals only with compelled testimonial evidence and is limited to accused persons; Article 21 safeguards privacy and personal liberty and is available to all individuals. In Aasha Lata Soni v. Durgesh Soni, recordings made without consent were held liable to exclusion on grounds of privacy. The distinction matters for smart speakers: while the device owner’s consent may weaken their own Article 21 claim, family members or guests whose voices are recorded without their knowledge or consent may still invoke the privacy protection of Article 21.


The Way Forward

The DPDP Act places the Data Protection Board of India at the core of regulatory oversight, with the authority to conduct investigations, impose heavy monetary penalties, and issue remedial directions. For large voice AI platforms, many of which are likely to be classified as Significant Data Fiduciaries, this will mean heightened obligations, including data protection impact assessments and enhanced governance.


This tension calls for concrete design and procedural solutions. Regulators should require multi-layered, purpose-specific consent and deploy Consent Managers that let family members independently give or withdraw consent without relying on the primary account holder. Verifiable parental consent should be strictly enforced rather than left to post hoc parental controls. On the technical side, smart speakers should adopt privacy-by-design defaults: discard raw ambient audio unless activation is confirmed, minimise retention, and permit human review only on explicit opt-in. Platforms should also maintain reliable records of how recordings are stored and processed, so that courts can verify authenticity and the chain of custody under Section 65B.
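
A rough sketch of what those defaults might look like follows. It is a toy model under stated assumptions, not a vendor specification: the hash-chained list stands in for proper tamper-evident storage, and the flags are hypothetical. The point is the pairing: audio is discarded unless activation is confirmed, human review requires opt-in, and every retention event leaves a custody record a court could later test under Section 65B.

```python
import hashlib
import json
from datetime import datetime, timezone

AUDIT_LOG: list = []    # stand-in for append-only, tamper-evident storage

def log_event(action: str, payload: bytes) -> None:
    """Append a hash-chained custody record -- the kind of trail a court
    could later examine when weighing authenticity under Section 65B."""
    prev = AUDIT_LOG[-1]["entry_hash"] if AUDIT_LOG else ""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
    }
    # Chain each entry to the previous one so tampering is detectable.
    entry["entry_hash"] = hashlib.sha256(
        (prev + json.dumps(entry, sort_keys=True)).encode()
    ).hexdigest()
    AUDIT_LOG.append(entry)

def handle_activation(audio: bytes, confirmed: bool, human_review_opt_in: bool):
    """Privacy-by-design default: nothing is retained unless the
    activation is confirmed; human review happens only on explicit opt-in."""
    if not confirmed:
        return None                             # discard raw ambient audio
    log_event("stored", audio)
    if human_review_opt_in:
        log_event("queued_for_human_review", audio)
    return audio
```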


Ultimately, voice AI in India must navigate consent law and evidence law simultaneously. Devices that sidestep either will remain legally vulnerable, whereas adaptable, privacy-first systems stand the best chance of withstanding both regulatory and judicial scrutiny.

This article has been authored by Ujjwal Gupta, student at Dr. Ram Manohar Lohiya National Law University, Lucknow. This blog is part of RSRR's Rolling Blog Series.
