


The idea that someone could be identified by the sound of his voice had its origins in the work of Alexander Melville Bell (father to Alexander Graham Bell).  Over one hundred years ago, he developed a visual representation of what the spoken word would look like.  It was based on pronunciation and he showed that there were subtle differences among different people who said the same things. Then in 1941, the laboratories of Bell Telephone in New Jersey produced a machine—the sound spectrograph—for mapping a voice onto a graph.  It analyzed sound waves and produced a visual record of voice patterns that were based on frequency, intensity, and time.  Acoustic scientists used it during World War II to identify enemy voices on telephones and radios.  However, with the war's end, the urgency for this technology diminished and little came of it until later.

The forensic science of voice identification has come a long way since it was first introduced in the American courts in the mid-1960s. In the early days of this identification technique there was little research to support the theory that human voices are unique and could be used as a means of identification. There was also no standardization of how an identification was reached, nor of the training or qualifications necessary to perform the analysis. Today voice identification analysis has matured into a sophisticated identification technique that uses the latest technology science has to offer. The research, which is still continuing, demonstrates the validity and reliability of the process when performed by a trained and certified examiner using established, standardized procedures. Voice identification experts are found all over the world. No longer limited to the visual comparison of a few words, the comparison of human voices now focuses on every aspect of the words spoken: the words themselves, the way the words flow together, and the pauses between them.


The sound spectrograph, an automatic sound wave analyzer, is a basic research instrument used in many laboratories for research studies of sound, music and speech. It has been widely used for the analysis and classification of human speech sounds and in the analysis and treatment of speech and hearing disorders.

The instrument produces a visual representation of a given set of sounds in the parameters of time, frequency and amplitude. The analog spectrograph is composed of four basic parts: (1) a magnetic tape recorder/playback unit, (2) a tape scanning device with a drum which carries the paper to be marked, (3) an electronic variable filter, and (4) an electronic stylus which transfers the analyzed information to the paper. The analog sound spectrograph samples energy levels in a small frequency range from a magnetic tape recording and marks those energy levels on electrically sensitive paper. The instrument then analyzes the next small frequency range and samples and marks the energy levels at that point. This process is repeated until the entire desired frequency range has been analyzed for that portion of the recording. The finished product is called a spectrogram and is a graphic depiction of the patterns, in the form of bars or formants, of the acoustical events during the time frame analyzed. Recent developments in sound spectrography have produced computerized digital sound spectrographs ranging from dedicated digital signal analysis workstations to PC-based systems for acquisition, analysis, editing, and playback. These sophisticated computer-based systems provide high-fidelity signal acquisition, high-speed digital processing circuitry for quick and flexible analysis, and CD-quality playback. The accuracy and reliability of the sound spectrograph, either analog or digital, have never been in question in any of the courts and have never been considered an issue in the admissibility of voice identification evidence.
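The band-by-band procedure described above can be sketched in a few lines of Python. This is only an illustration, not the method used by any particular instrument: the function names, the frame length, and the choice of the Goertzel recurrence to measure the energy in one narrow band are all assumptions made for the example.

```python
import math

def band_energy(frame, sample_rate, freq):
    """Energy of `frame` near `freq` (Hz), using the Goertzel recurrence.

    Assumption for illustration: one narrow band is measured at a time,
    loosely mirroring the variable filter of the analog instrument.
    """
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def band_spectrogram(samples, sample_rate, band_centers, frame_len):
    """Return one row per frequency band and one column per time frame."""
    rows = []
    for freq in band_centers:  # analyze one band, then move to the next
        row = []
        for start in range(0, len(samples) - frame_len + 1, frame_len):
            frame = samples[start:start + frame_len]
            row.append(band_energy(frame, sample_rate, freq))
        rows.append(row)
    return rows

# Hypothetical input: a pure 1000 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2.0 * math.pi * 1000 * n / rate) for n in range(800)]
rows = band_spectrogram(tone, rate, [500, 1000, 2000], frame_len=200)
for freq, row in zip([500, 1000, 2000], rows):
    print(freq, [round(e, 1) for e in row])
```

Each row corresponds to one frequency band and each column to one time frame, so the nested list plays the role of the marked paper: for the test tone, the 1000 Hz row carries far more energy than the other two.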


The method by which a voice is identified is a multifaceted process requiring the use of both aural and visual senses. In the typical voice identification case the examiner is given several recordings: one or more recordings of the voice to be identified and one or more recorded voice samples of one or more suspects. From these recordings the examiner must determine the identity of the unknown voice.

The first step is to evaluate the recording of the unknown voice, checking to make sure the recording has a sufficient amount of speech with which to work and that the quality of the recording is of sufficient clarity in the frequency range required for analysis.1 The volume of the recorded voice signal must be significantly higher than that of the environmental noise. The greater the number of obscuring events, such as noise, music, and other speakers, the longer the sample of speech must be. Some examiners report that they reject as many as sixty percent of the cases submitted to them with one of the main reasons for rejection being the poor quality of the recording of the unknown voice. Once the unknown voice sample has been determined to be suitable for analysis, the examiner then turns his attention to the voice samples of the suspects. Here also, the recordings must be of sufficient clarity to allow comparison, although at this stage, the recording process is usually so closely controlled that the quality of recording is not a problem.
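One common way to express how far the voice signal stands above the environmental noise is a signal-to-noise ratio in decibels. The sketch below is an illustration only: the text does not say how examiners measure this or what margin counts as sufficient, so the function names and the sample values are assumptions.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels.

    `speech` is a stretch of the recording containing the voice and
    `noise` a stretch containing only background sound (both assumed
    to be chosen by the examiner). `noise` must not be pure silence.
    """
    return 20.0 * math.log10(rms(speech) / rms(noise))

# Hypothetical samples: the speech level is ten times the noise level.
speech = [0.5, -0.5, 0.5, -0.5]
noise = [0.05, -0.05, 0.05, -0.05]
print(round(snr_db(speech, noise), 1))
```

A tenfold difference in level corresponds to 20 dB. What margin makes a recording suitable for analysis is a matter for the examiner and is not stated in the text.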

The examiner can only work with speech samples which are the same as the text of the unknown recording. Under the best of circumstances the suspects will repeat, several times, the text of the recording of the unknown speaker and these words will be recorded in a similar manner to the recording of the unknown speaker. For example, if the recording of the unknown speaker was a bomb threat made to a recorded telephone line then each of the suspects would repeat the threat, word for word, to a recorded telephone line. This will provide the examiner with not only the same speech sounds for comparison but also with valuable information about the way each speech sound completes the transition to the next sound.

As in any other form of identification analysis, the lower the quality of the evidence with which the examiner has to work, the greater the amount of evidence and time needed to complete the analysis, and the lower the chance of a positive conclusion.

Once the evidence has been determined to be sufficient to perform the analysis, the examiner begins the two-step process of voice sample comparison: one step aural (listening) and the other spectrographic (visual).

In the aural comparison of the voice samples the examiner compares both single speech sounds and series of speech sounds of the known and unknown samples. Once the examiner has located the portions to be used for the analysis, a more detailed aural comparison is undertaken. This comparison can be accomplished in many different ways. One of the most commonly used methods is re-recording a speech sound sample of the unknown speaker followed immediately by a re-recording of the same speech sounds of the suspect. This is repeated several times so that the final product is a recording of specific speech sounds, in alternating order, by the unknown speaker followed by the suspect. Such comparisons have been greatly facilitated by digital audio recording equipment, which allows for the digital recording, storage, and repeated playback of only the speech sounds to be examined. During the aural comparison the examiner studies the psycholinguistic features of the speaker's voice. A large number of qualities and traits are examined, from such general traits as accent and dialect to inflection, syllable grouping and breath patterns. The examiner also scrutinizes the samples for signs of speech pathologies and peculiar speech habits.

The second step in the voice identification process is the spectrographic analysis of the recorded samples. The sound spectrograph is an automatic sound wave analyzer with a high quality, fully functional tape recorder. The speech samples to be analyzed are recorded on the sound spectrograph. The recording is then analyzed in two and one half second segments. The product is a spectrogram, a graphic display of the recorded signal on the basis of time and frequency with a general indication of amplitude.

The spectrograms of the unknown speaker are then visually compared to the spectrograms of the suspects. Only those speech sounds which are the same are compared. The examiner looks not only for similarities but also for differences. The differences are closely examined to determine if they are due to pronunciation differences or if they are indicative of different speakers.

When the analysis is complete the examiner integrates his findings from both the aural and spectrographic analyses into one of five standard conclusions:

1. a positive identification
2. a probable identification
3. a positive elimination
4. a probable elimination
5. no decision

In order to arrive at a positive identification the examiner must find a minimum of twenty speech sounds which possess sufficient aural and spectrographic similarities. There can be no differences, either aural or spectrographic, that cannot be accounted for. The probable identification conclusion is reached when there are fewer than twenty similarities and no unexplained differences. The result of positive elimination is rendered when twenty differences between the samples are found that cannot be attributed to anything other than different voices having produced the samples. A probable elimination decision is usually reached when working with limited text or a recording of lower quality. The no decision conclusion is used when the quality of the recording is so poor that there is insufficient information with which to work, or when there are too few common speech sounds suitable for comparison.
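The decision rules above can be written out as a small function, which makes the thresholds easier to see. This is a simplified sketch, not a statement of examiner practice. The function and parameter names are hypothetical. The `usable` flag stands in for the examiner's judgment that the recordings contain enough comparable speech. Any case with some unexplained differences, but fewer than twenty, is assumed here to fall under probable elimination, which the text describes only loosely.

```python
def conclude(similar_sounds, unexplained_differences, usable=True, threshold=20):
    """Map comparison counts to one of the five standard conclusions.

    Assumptions for illustration: `usable` is False when the recording is
    too poor or has too few common speech sounds; `threshold` is the
    figure of twenty given in the text.
    """
    if not usable:
        return "no decision"
    if unexplained_differences == 0:
        if similar_sounds >= threshold:
            return "positive identification"
        if similar_sounds > 0:
            return "probable identification"
        return "no decision"  # nothing comparable was found
    if unexplained_differences >= threshold:
        return "positive elimination"
    # Assumed reading: some unexplained differences, fewer than twenty.
    return "probable elimination"

print(conclude(22, 0))
print(conclude(12, 0))
print(conclude(3, 20))
print(conclude(8, 5))
print(conclude(25, 0, usable=False))
```

In practice the examiner weighs aural and spectrographic findings together, so a count-based rule like this can only approximate the reasoning described.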

The contribution of Lawrence G. Kersta

Voiceprint technology began to gain notice for criminal investigations in the early 1960s, when the New York City Police Department received numerous bomb threats by phone against major airlines. The FBI asked Bell Labs to help. Lawrence G. Kersta, one of its senior engineers, was assigned the task of finding a method of identification that would stop the calls and bring the perpetrators to justice. He was a physicist who had worked with the sound spectrograph in its early days. It took him more than two years and the analysis of over 50,000 voices, but he managed to offer a technique that he claimed tested at 99.65% accuracy.

Lawrence Kersta noted that each person's voice has a unique quality that can be mapped on a graph. One person's vocal cords, no matter how similar they might look, process sounds differently from someone else's. The size and shape of a person's vocal cavity, tongue, and nasal cavities contribute to this, as does the way that person coordinates lips, jaw, tongue, and soft palate to make speech. No combination of these things is like any other. That means that our voices are sufficiently unique to make personal identification based on voice sounds possible. In 1966, the Michigan State Police formed a Voice Identification Unit and hired Lawrence Kersta to train its officers.

California came to a similar holding when the issue first reached the appellate level in People v. King.15 The State brought in Lawrence Kersta as the voice identification expert to testify as to the reliability of the technique. The defense brought in seven speech scientists and engineers to rebut Kersta's claims. The court held that "Kersta's claims for the accuracy of the 'voiceprint' process are founded on theories and conclusions which are not yet substantiated by accepted methods of scientific verification".

Admissibility of Evidence

Voiceprint technology came into the American courts in the 1960s, and judges were divided on whether to admit it as scientific evidence. The first published opinion to rule on the admissibility of voice identification analysis was United States v. Wright, 17 USCMA 183, 37 CMR 447 (1967). This was a court-martial proceeding in which the appellate court affirmed the admission of spectrographic voice identification evidence by the board of review. The New Jersey Supreme Court was the first non-military court to make an appellate review, in State v. Cary. Courts in New York and California had admitted this type of testimony, so the New Jersey justices remanded the case to check the accuracy of the equipment. In State ex rel. Trimble v. Heldman 19, the Supreme Court of Minnesota held that "spectrograms ought to be admissible at least for the purpose of corroborating opinions as to identification by means of ear alone".

In 1976 the New York Supreme Court pointed out, in People v. Rogers, that fifty different trial courts had admitted spectrographic voice identification evidence, as had fourteen out of fifteen U.S. District Court judges, and that only two out of thirty-seven states considering the issue had rejected admission. The Rogers court stated that this technique, when accompanied by aural examination and conducted by a qualified examiner, had reached the level of general scientific acceptance among those who would be expected to be familiar with its use, and therefore the level of reliability necessary for admission.

The Supreme Court of Pennsylvania rejected admission in Commonwealth v. Topa holding that the technician's opinion alone will not suffice to permit the introduction of scientific evidence into a court of law. This was the same situation, in fact the same single expert, which confronted the Kelly court.

In February of 1989, the United States Court of Appeals for the Seventh Circuit affirmed the decision of the United States District Court for the Northern District of Illinois admitting spectrographic voice identification evidence in the criminal case of United States of America v. Tamara Jo Smith. In United States v. Maivia,132 the United States District Court admitted spectrographic evidence after a four-day hearing on the issue. In affirming the order of the Appellate Division, the New York Supreme Court, in People v. Jeter, concluded that the trial court was not able to properly determine, based on case law and existing literature, that voice identification evidence is generally accepted as reliable. As voiceprint technology and its interpretation become more widely used, more courts are responding favorably to it. Nevertheless, it remains an unsettled question, in particular when courts review older decisions based on the technology in its early stages.

Clifford Irving case

Lawrence Kersta believed that an individual's voice does not change over his or her lifetime, but other experts have disputed this point. If the body changes, so does the voice. Where a person lives can affect the voice, as can illness, stress, aging, and other factors. Nevertheless, Kersta maintained that the essential qualities of the voice remain constant. He felt that he finally proved this in one of the most famous cases involving the spectrograph: that of the reclusive Howard Hughes.

In 1971, a man named Clifford Irving came to New York to cut a deal for what he claimed was Hughes' autobiography, ghosted by him. He had letters that he insisted were written by Hughes, and experts soon authenticated them. The publisher McGraw-Hill accepted his claim, advancing him $765,000 and announcing its intent to publish the book. Eventually Irving turned in a 1,200-page manuscript. It was difficult to ascertain whether Hughes had actually authorized this transaction, since for the past fifteen years he had been exceedingly elusive. That Irving had letters from him seemed a good indicator that they knew each other. Several people who had known Hughes read the manuscript and felt convinced that it was genuinely his story. However, Hughes finally surfaced from his retreat on Paradise Island in the Bahamas to renounce the book. He claimed that he had never met Clifford Irving and that the whole thing was a fake. He added that he did not know where Irving had gotten his information. However, he was not willing to make his renunciation in person. He agreed only to do this by phone. That meant that he could be identified only by his voice: how it sounded and what he said.

A group of reporters familiar with him from his early days was assembled by NBC in Los Angeles to ask him questions for two hours.  Their purpose was to authenticate the voice on the phone as that of the famous, eccentric billionaire, and they were to ask some key questions that would trip up an imposter.  The man on the phone responded in convincing detail.  He talked about such things as the make of his plane and trips that he had made, but he stumbled when asked about the good luck charm that a woman had presented to him before his 1938 trip around the world.  He said that he could not recall the incident, but moments later he did: She had placed chewing gum on the tail of his plane. This entire phone conversation was recorded and as they listened again, the reporters all believed that Hughes had been the man on the phone.  That meant that Irving was a fraud.       

Irving defended himself by insisting that the person who had called was the imposter, but NBC had hired Lawrence Kersta to make a voiceprint analysis. He measured pitch, tone, and volume to compare the voice pictures on a line-by-line basis, comparing a recording of a speech that Hughes had made in 1947 with the recordings from the interview. Finally he announced that the man who had spoken to reporters was Howard Hughes. Even one of Kersta's most vocal critics, phonetics professor Peter Ladefoged, admitted that the recordings were remarkably similar. Irving was arrested and convicted of forgery. He repaid the publisher and was sentenced to thirty months in prison. Since the recordings had been made nearly a quarter of a century apart and Hughes' voice had deepened, there had been concern that changes would make the reading impossible. However, the spectrographic patterns proved to be impressively similar. This result further convinced Kersta that the inherent uniqueness of an individual's voice remains constant.


An arsonist jailed for murder yesterday has become the first person to be convicted on the evidence of a “voice identification parade”. Assad Khan’s voice was picked out by the murder victim’s tenant, who overheard him plotting to use petrol to set fire to the house. Although Raymond Sarong did not see the conversation taking place, he recognised Khan’s voice as he had seen him around the Hounslow area of West London and had heard him talking. Khan was jailed for life at the Old Bailey for murder. Khan, 21, was paid £10 for the petrol and £200 to light the blaze that killed Harbans Johal. Driven by jealousy and revenge, Mrs Johal’s former boyfriend, Didar Bains, had paid him to get back at her. The court was told that Mr Sarong had alerted police to what he had heard and the police, in turn, instructed the emergency services to treat all calls from her house as urgent. However, when Khan struck later that night, Mrs Johal failed to escape. After the murder, Detective Sergeant John McFarlane, who was investigating the case, came up with the idea of an auditory identification parade to pinpoint Khan as the killer. The technique, never used before, involved instructing a linguistics expert from Cambridge University to compile a selection of random pieces of conversations from different people. Khan’s voice was then added to the selection of tapes and Mr Sarong was asked if he could identify the voice of the man he heard agreeing to carry out the arson attack. Sarong’s positive identification ensured that both Khan and Bains of Heston, West London, were convicted of murder. Jailing the pair for life, Judge Giles Forrester said: “This was a wicked and cowardly act for which you were entirely responsible. “You came together for your own reasons to act in this way which has caused the death of a fellow human being.” (U.K. News TIMES, Hellen Studd)

The voice identification team uses the lab for quantification of the voice signal in terms of pitch, loudness, and quality, as well as for measurement of the acoustic parameters and breathing dynamics.
(1) Sound Spectrograph: The sound spectrograph produces a "print" that shows fundamental frequency, harmonics, and intensity of the voice. This is used to identify subtle but crucial changes in voice quality that can then be treated. The spectrogram identifies the presence of noise in the voice, such as tremor, pitch breaks, and abrupt starts.
(2) Aerodynamic Analysis: This technology measures laryngeal airflow, air pressure, and intensity during voice production. This allows for the analysis of breath use, laryngeal control, and vocal efficiency during phonation. Aerodynamic assessment reflects the physiology of vocal fold opening and closing.
(3) Voice Range Profile: Measures the high and low frequencies and intensities of the singing voice. This allows one to identify and quantify areas of the singer's range that are problematic. The voice range profile can then be used to monitor improvement, following treatment, in the singer's ability to control vocal intensity throughout his/her vocal range.
(4) Acoustic Analysis: Measures frequency of phonation as well as the variation of the voice. The variation of the voice signal associated with opening and closing of the vocal folds is related to the perception of hoarseness, harshness, and breathiness.
(5) Feedback: All the systems used to assess vocal function must also be used to monitor therapy and provide an "acoustic mirror," that is, feedback to the patient as he/she talks and sings.


The Indian Evidence Act, before it was amended by the Information Technology Act, 2000, dealt mainly with evidence in oral or documentary form. It said nothing about the admissibility, nature and evidentiary value of a conversation or statement recorded on an electro-magnetic device. Confronted with questions of this nature and called upon to decide them, the law courts in India as well as in England devised and developed principles under which such evidence could be received in court and acted upon.

In India, voice identification is regularly carried out at the forensic science laboratories in Chandigarh, and the Supreme Court has held that voice identification data is admissible in court.

In Bangalore, the SRC Institute of Speech and Hearing has facilities for voice analysis.

The All India Institute of Speech and Hearing, Mysore, which has been working in the field for many years now, even wants to start a one-year PG Diploma course in forensic voice analysis.

Case Law: Whether a tape-recorded conversation, a form of voice identification, is admissible in evidence

In Hopes v. H.M. Advocate, 1960 Scots Law Times 264, the court while dealing with the question of admissibility of tape recorded conversation observed as under:

New techniques and new devices are the order of the day. I can’t conceive, for example, of the evidence of a ship’s captain as to what he observed being turned down as inadmissible because he had used a telescope, any more than the evidence of what an ordinary person sees with his eyes becomes incompetent because he was wearing spectacles. Of course, comments and criticism can be made, and no doubt will be made, on the audibility or the intelligibility, or perhaps the interpretation, of the results of the use of a scientific method; but that is another matter, and that is a matter of value, not of competency.

In Rex v. Maqsud, 1965(2) All ER 461, the Court of Criminal Appeal observed that the time had come when the court should state its views on a matter of law which is likely to be increasingly raised as time passes. For many years now photographs have been admissible in evidence on proof that they are relevant to the issues involved in the case and that the print as seen represents situations that have been reproduced by means of mechanical and chemical devices. Evidence of things seen through telescopes or binoculars, which otherwise could not be picked up by the naked eye, has been admitted, and now there are devices for picking up, transmitting and recording conversations. In principle no difference can be made between a tape recording and a photograph. The court was of the view that it would be wrong to deny to the law of evidence the advantages to be gained by new techniques and devices.

In India, the earliest case in which the admissibility of a tape-recorded conversation came up for consideration is Rupchand v. Mahabir Prasad, AIR 1956 Punjab 173. The court in this case declined to treat a tape-recorded conversation as writing within the meaning of section 3 (65) of the General Clauses Act, but allowed it to be used under section 155(3) of the Evidence Act as a previous statement to shake the credit of a witness. The court held that there is no rule of evidence which prevents a party, who is endeavoring to shake the credit of a witness by use of a former inconsistent statement, from deposing that while he was engaged in conversation with the witness a tape recorder was in operation, or from producing the said tape recorder in support of the assertion that a certain statement was made in his presence.

In S. Pratap Singh v. State of Punjab, AIR 1964 SC 72, a five-judge bench of the Apex Court considered the issue and clearly propounded that tape-recorded talks are admissible in evidence. The simple fact that such evidence can easily be tampered with could not be a ground to reject it as inadmissible or to refuse to consider it, because there are few documents, and possibly no piece of evidence, which could not be tampered with.

The Apex Court in Yusufalli Esmail Nagree v. State of Maharashtra, AIR 1968 SC 147, considered various aspects of the admissibility of a tape-recorded conversation. This was a case relating to an offence under section 165-A of the Indian Penal Code (now section 12 of the Prevention of Corruption Act, 1988, which punishes abetment of the offences of a public servant taking gratification other than legal remuneration (S. 7) and a public servant obtaining a valuable thing without consideration from a person concerned in a proceeding or business transacted by such public servant (S. 11)). At the instance of the investigating agency, the conversation between the accused, who wanted to offer a bribe, and the complainant was tape-recorded. The prosecution wanted to use this tape-recorded conversation as evidence against the accused, and it was argued that it was hit by section 162 CrPC as well as article 20(3) of the Constitution. In this landmark decision, the court laid down in unequivocal terms that the process of tape recording offers an accurate method of storing and later reproducing sounds. The imprint on the magnetic tape is the direct effect of the relevant sounds. Like a photograph of a relevant incident, a contemporaneous tape record of a relevant conversation is a relevant fact and is admissible under section 7 of the Indian Evidence Act. The Apex Court, after examining the entire issue in the light of various pronouncements, laid down the following principles:
a) The contemporaneous dialogue, which was tape recorded, formed part of res gestae and is relevant and admissible under section 8 of the Indian Evidence Act.
b) The contemporaneous tape record of a relevant conversation is a relevant fact and is admissible under section 7 of the Indian Evidence Act.
c) Such a statement was not in fact a statement made to the police during investigation and, therefore, cannot be held to be inadmissible under section 162 of the Criminal Procedure Code.
d) Although such a recorded conversation was procured without the knowledge of the accused, it was not elicited by duress, coercion or compulsion, nor extracted in an oppressive manner, by force, or against the wishes of the accused. Therefore the protection of article 20(3) was not available.
e) One of the features of magnetic tape recording is the ability to erase and re-use the recording medium. Therefore, the evidence must be received with caution. The court must be satisfied beyond reasonable doubt that the record has not been tampered with.

In Rakesh Bisht v. CBI, Justice Badar Durrez Ahmed of the High Court of Delhi at New Delhi, on 3.1.2007, referred to Central Bureau of Investigation v. Abdul Karim Ladsab Telgi and Others: 2005 CRI. L.J. 2868, and in particular to paragraphs 11 and 12, to indicate that in similar circumstances the Bombay High Court had directed the taking of voice samples. He also referred to certain English decisions, namely R v Robson, R v Harris: 1972 [2] All ER 699 and R v Stevenson, R v Hulse, R v Whitney: 1971 [1] All ER 678. The court referred to the Constitution Bench judgment in State of Bombay v. Kathi Kalu Oghad: AIR 1961 SC 1808 to indicate that the taking of voice samples would definitely not infringe the provisions of Article 20(3) of the Constitution of India. The decision of the Supreme Court in R. M. Malkani v. State of Maharashtra: 1973 SCC (Cri) 399 was also referred to for the proposition that a tape-recorded conversation is admissible in evidence provided the conversation is relevant to the matters in issue. This decision supports the view that tape-recorded conversations are both relevant and admissible for the purposes of ascertaining the authenticity of a conversation already recorded. It would therefore be necessary to take the voice samples of the petitioners for the purposes of identification.

Tendering tape-recorded conversations before the law courts as evidence has become almost common practice, particularly in cases arising under the Prevention of Corruption Act, where the conversation is recorded by sending the complainant with a recording device to the person demanding or offering the bribe. In such cases the court has to face various questions regarding the admissibility, nature and evidentiary value of the tape-recorded conversation.

If in a particular case there is a well-grounded suspicion, even short of proof, that the tape recording has been tampered with, that would be a good ground for the court to discount wholly its evidentiary value, as in Pratap Singh v. State of Punjab, AIR 1964 SC 72. In Ram Singh v. Col. Ram Singh, AIR 1986 SC 3, the Apex Court set out the following conditions for the admissibility of a tape-recorded conversation:
1. The voice of the speaker must be duly identified by the maker of the record or by others who recognize his voice. Where the maker has denied the voice it will require very strict proof to determine whether or not it was really the voice of the speaker.
2. The accuracy of the tape recorded statement has to be proved by the maker of the record by satisfactory evidence direct or circumstantial.
3. Every possibility of tampering with or erasure of a part of a tape recorded statement must be ruled out, otherwise it may render the said statement out of context and, therefore, inadmissible.
4. The statement must be relevant according to the rules of the Evidence Act.
5. The recorded cassette must be carefully sealed and kept in safe or official custody.
6. The voice of the speaker should be clearly audible and not lost or distorted by other sounds or disturbance.

In Ziyauddin Burhanuddin Bukhari v. Brijmohan Ramdas Mehta, AIR 1975 SC 1788, the Apex Court considered the value and use of transcripts. It expressed the view that a transcript could be used to show what the transcriber found recorded there at the time of transcription, and that the evidence of the makers of the transcripts is certainly corroborative because it goes to confirm what the tape record contained. The Apex Court also made it clear that such transcripts can be used by a witness to refresh his memory under section 159 of the Evidence Act, and that their contents can be brought on record by direct oral evidence in the manner prescribed by section 160 of the Evidence Act.

The Apex Court in the Ziyauddin Burhanuddin Bukhari case clearly laid down that tape recorded speeches were "documents as defined by section 3 of the Evidence Act", which stood on no different footing than photographs.

The concept of evidence stands totally reformed after the coming into force of the Information Technology Act, 2000 on 17.10.2000. Section 2(r) of this Act is relevant in this respect: it defines information in electronic form as information generated, sent, received or stored in media, magnetic, optical, computer memory, micro film, computer generated micro fiche or similar device. Under section 2(t), ‘electronic record’ means data, record or data generated, image or sound stored, received or sent in an electronic form or micro film or computer generated micro fiche. Section 92 of this Act, read with the Second Schedule, amends the definition of ‘evidence’ contained in section 3 of the Indian Evidence Act. The amended definition runs as under:
“Evidence:- ‘Evidence’ means and includes-
(1) all statements which the court permits or requires to be made before it by witnesses, in relation to matters of fact under inquiry; such statements are called oral evidence;
(2) all documents including electronic records produced for the inspection of the Court; such documents are called documentary evidence.”

The present legal position recognizes information stored on a magnetic or electronic device and treats it as documentary evidence within the meaning of section 3 of the Indian Evidence Act.

At this juncture, three legal points must be answered for a better understanding.

1. Whether information/evidence stored on a magnetic or electronic device is primary or secondary?
2. Whether such evidence is direct or hearsay?
3. Whether such evidence is corroborative or substantive?

The point whether such evidence is primary and direct was dealt with by the Apex Court in N. Sri Rama Reddy v. V.V. Giri, AIR 1971 SC 1162. The Court held that, like any document, the tape record itself was primary and direct evidence, admissible of what has been said and picked up by the receiver. This view was reiterated by the Apex Court in R.M. Malkani v. State of Maharashtra, AIR 1973 SC 157. In this case the Court ordained that when a court permits a tape recording to be played over, it is acting on real evidence if it treats the intonation of the words to be relevant and genuine. After Rama Reddy's case (supra), a three-judge bench of the Apex Court in Ziyauddin Burhanuddin Bukhari v. Brijmohan Ramdas Mehta, AIR 1975 SC 1788 held that the use of a tape recorded conversation was not confined to the purposes of corroboration and contradiction only; when duly proved by satisfactory evidence of what was found recorded and of the absence of tampering, it could, subject to the provisions of the Evidence Act, be used as substantive evidence. Giving an example, the Court pointed out that when it was disputed or in issue whether a person's speech on a particular occasion contained a particular statement, there could be no more direct or better evidence of it than its tape record, assuming its authenticity to be duly established.

From the aforesaid it can well be gathered, as a settled legal proposition, that evidence of a tape recorded conversation, being primary and direct, can well be used to establish what was said by a person on a particular occasion.

In N. Sri Rama Reddy (supra) the Apex Court further held that the evidence of a tape recorded conversation or statement, apart from being used for corroboration, is admissible for the purposes stated in section 146 (1), Exception (2) to section 153 and section 155 (3) of the Evidence Act.

The technique of voice identification by means of aural and spectrographic comparison is still an unsettled topic in law. Although the spectrographic voice identification method has progressed greatly since it was first introduced to a court of law back in the mid 1960's, it still faces stiff resistance on the issue of admissibility in the courts today.

Adv. K.C. Suresh, B.A., LL.M (Crimes), PGDHR (Human Rights)

Legal Adviser (Rtd) Vigilance & Anti-Corruption bureau, State of Kerala and State Special Public Prosecutor
