Tuesday, March 04, 2008

Capturing Audio with Java Sound API

This Tech Tip is reprinted with permission from java.sun.com

The Java Sound API has been a part of the standard libraries of the Java 2 Platform since the 1.3 release. Found in the javax.sound.sampled package, the Java Sound API provides support for playing and capturing audio. In addition, the library offers a software-based audio mixer and MIDI (Musical Instrument Digital Interface) device access. In this tip, you'll learn how to capture audio through the Java Sound API and play it back.

The javax.sound.sampled package consists of eight interfaces, twelve top-level classes, twelve inner classes, and two exceptions. To record and play audio, you only need to deal with a total of seven parts of the package.

Let's examine recording first. The basic recording process is as follows:

Describe the audio format in which you want to record the data. This includes specifying the sampling rate and the number of channels (mono versus stereo) for the audio. You specify these properties using the aptly named AudioFormat class. There are two constructors for creating an AudioFormat object:

AudioFormat(AudioFormat.Encoding encoding,
            float sampleRate, int sampleSizeInBits,
            int channels, int frameSize, float frameRate,
            boolean bigEndian)

AudioFormat(float sampleRate, int sampleSizeInBits,
            int channels, boolean signed, boolean bigEndian)

The first constructor lets you set the audio format encoding explicitly, while the second uses a default. The available encodings are ALAW, PCM_SIGNED, PCM_UNSIGNED, and ULAW. The second constructor uses PCM encoding, with PCM_SIGNED or PCM_UNSIGNED chosen according to the signed argument. Here is an example that uses the second constructor to create an AudioFormat object for single-channel recording at 8 kHz:

float sampleRate = 8000;
int sampleSizeInBits = 8;
int channels = 1;
boolean signed = true;
boolean bigEndian = true;
AudioFormat format = new AudioFormat(sampleRate,
    sampleSizeInBits, channels, signed, bigEndian);
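For comparison, here is a sketch of the same format built with the first, explicit-encoding constructor. For PCM data the frame size is (sampleSizeInBits / 8) * channels bytes and the frame rate equals the sample rate; the class name FormatDemo is just for illustration:

```java
import javax.sound.sampled.AudioFormat;

public class FormatDemo {
    public static void main(String[] args) {
        float sampleRate = 8000;
        int sampleSizeInBits = 8;
        int channels = 1;
        // For PCM, a frame holds one sample per channel,
        // so frame size is (bits / 8) * channels and the
        // frame rate equals the sample rate.
        int frameSize = (sampleSizeInBits / 8) * channels;
        AudioFormat format = new AudioFormat(
            AudioFormat.Encoding.PCM_SIGNED,
            sampleRate, sampleSizeInBits,
            channels, frameSize, sampleRate, true);
        System.out.println(format.getEncoding());  // PCM_SIGNED
        System.out.println(format.getFrameSize()); // 1
    }
}
```

This builds the same 8 kHz, 8-bit, mono, signed, big-endian format as the two-argument version above, just with every property spelled out.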

After you describe the audio format, you need to get a DataLine. This interface represents an audio feed from which you can capture the audio. The actual capturing is done through a subinterface called TargetDataLine. To get a TargetDataLine, you ask the AudioSystem, specifying the line you want in the form of a DataLine.Info object that names the line type and audio format. Here are some lines of source that get the TargetDataLine:

DataLine.Info info = new DataLine.Info(
    TargetDataLine.class, format);
TargetDataLine line = (TargetDataLine)
    AudioSystem.getLine(info);

If the TargetDataLine is unavailable, a LineUnavailableException is thrown.
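You can also avoid triggering the exception by asking AudioSystem whether a matching line exists before requesting it. Here is a minimal sketch of that check; the class name LineCheck is just for illustration:

```java
import javax.sound.sampled.*;

public class LineCheck {
    public static void main(String[] args) {
        AudioFormat format = new AudioFormat(8000, 8, 1, true, true);
        DataLine.Info info = new DataLine.Info(
            TargetDataLine.class, format);
        // Probe first: isLineSupported never throws, so you can
        // report the problem instead of catching an exception.
        if (AudioSystem.isLineSupported(info)) {
            System.out.println("capture line available");
        } else {
            System.out.println("capture line not available");
        }
    }
}
```

On a machine with no sound hardware (or no microphone), the check simply reports that the line is unavailable.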

At this point you have your input source. You can think of the TargetDataLine as an input stream. However, it requires some setup before you can read from it. Setup in this case means first opening the line using the open() method, and then initializing the line using the start() method:

line.open(format);
line.start();

Your data line is ready, so you can start recording from it as shown in the following lines of code. Here you save a captured audio stream to a byte array for later playing. You could also save the audio stream to a file. Notice that you have to manage when to stop from outside the read loop; in this snippet, the boolean flag externalTrigger is assumed to be cleared by another thread.

int bufferSize = (int) format.getSampleRate() *
    format.getFrameSize();
byte[] buffer = new byte[bufferSize];
ByteArrayOutputStream out = new ByteArrayOutputStream();
while (externalTrigger) {
    int count = line.read(buffer, 0, buffer.length);
    if (count > 0) {
        out.write(buffer, 0, count);
    }
}
out.close();

Now let's examine playing audio. There are two key differences in playing audio as compared to recording audio. First, when you play audio, the bytes come from an AudioInputStream instead of a TargetDataLine. Second, you write to a SourceDataLine instead of into a ByteArrayOutputStream. Besides that, the process is the same.

To get the AudioInputStream, you need to convert the ByteArrayOutputStream into the source of the AudioInputStream. The AudioInputStream constructor requires an input stream over the recorded bytes, the audio format, and the length of the stream in sample frames:

byte[] audio = out.toByteArray();
InputStream input = new ByteArrayInputStream(audio);
AudioInputStream ais = new AudioInputStream(input,
    format, audio.length / format.getFrameSize());
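As noted earlier, you could save the captured audio to a file instead of playing it back. Once you have an AudioInputStream, AudioSystem.write can do that in one call. Here is a minimal sketch that writes one second of silence (standing in for captured data) as a WAV file; the file name capture.wav is arbitrary:

```java
import javax.sound.sampled.*;
import java.io.*;

public class SaveWav {
    public static void main(String[] args) throws IOException {
        AudioFormat format = new AudioFormat(8000, 8, 1, true, true);
        // One second of 8 kHz, 8-bit mono audio is 8000 bytes;
        // all zeros stands in for real captured data here.
        byte[] audio = new byte[8000];
        AudioInputStream ais = new AudioInputStream(
            new ByteArrayInputStream(audio), format,
            audio.length / format.getFrameSize());
        // AudioSystem.write adds the WAVE header for you.
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE,
            new File("capture.wav"));
        System.out.println("wrote capture.wav");
    }
}
```

The resulting file can be opened by any audio player, since the WAVE header records the format you captured with.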

Getting the DataLine is similar to the way you get it for audio recording, but for playing audio, you need to fetch a SourceDataLine instead of a TargetDataLine:

DataLine.Info info = new DataLine.Info(
    SourceDataLine.class, format);
SourceDataLine line =
    (SourceDataLine) AudioSystem.getLine(info);

Setup for the line is identical to the setup for audio recording:

line.open(format);
line.start();

The last step is to play the audio as shown below. Notice that this step is similar to the last step in recording. However, here you read from the buffer and write to the data line. There is also an added drain operation that works like a flush on an output stream.

int bufferSize = (int) format.getSampleRate()
    * format.getFrameSize();
byte[] buffer = new byte[bufferSize];

int count;
while ((count =
        ais.read(buffer, 0, buffer.length)) != -1) {
    if (count > 0) {
        line.write(buffer, 0, count);
    }
}
line.drain();
line.close();

The following program puts these steps together to demonstrate using the Java Sound API to record and play audio. The program also presents a GUI. Press the Capture button to start recording the audio, the Stop button to stop recording, and the Play button to play back the audio.

Note: Depending on the audio support your platform provides, you might need to change the format returned by the getFormat method in the program. If you don't know what format is supported by your platform, download the demo program from the Java Sound Demo page, and run it. Click on the CapturePlayback tab, and find a set of format settings that work for you. You can load an audio file from the audio subdirectory, and then try various settings on the left until something works. Use those settings in the creation of the AudioFormat returned by getFormat().
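If you would rather probe programmatically than experiment with the demo, you can loop over candidate formats and ask AudioSystem which ones have a matching capture line. This is just a sketch; the candidate rates and sample sizes below are common values, not an exhaustive list:

```java
import javax.sound.sampled.*;

public class ProbeFormats {
    public static void main(String[] args) {
        float[] rates = {8000, 11025, 16000, 22050, 44100};
        int[] sizes = {8, 16};
        for (float rate : rates) {
            for (int bits : sizes) {
                AudioFormat format =
                    new AudioFormat(rate, bits, 1, true, true);
                DataLine.Info info = new DataLine.Info(
                    TargetDataLine.class, format);
                // isLineSupported reports availability without
                // actually opening a line.
                System.out.println(format + " -> "
                    + (AudioSystem.isLineSupported(info)
                        ? "supported" : "not supported"));
            }
        }
    }
}
```

Any format reported as supported is a safe choice to return from getFormat() in the program below.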

import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.io.*;
import javax.sound.sampled.*;

public class Capture extends JFrame {

    // Flag the capture thread polls; cleared by the Stop button.
    protected boolean running;
    ByteArrayOutputStream out;

    public Capture() {
        super("Capture Sound Demo");
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        Container content = getContentPane();

        final JButton capture = new JButton("Capture");
        final JButton stop = new JButton("Stop");
        final JButton play = new JButton("Play");

        capture.setEnabled(true);
        stop.setEnabled(false);
        play.setEnabled(false);

        ActionListener captureListener = new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                capture.setEnabled(false);
                stop.setEnabled(true);
                play.setEnabled(false);
                captureAudio();
            }
        };
        capture.addActionListener(captureListener);
        content.add(capture, BorderLayout.NORTH);

        ActionListener stopListener = new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                capture.setEnabled(true);
                stop.setEnabled(false);
                play.setEnabled(true);
                running = false;
            }
        };
        stop.addActionListener(stopListener);
        content.add(stop, BorderLayout.CENTER);

        ActionListener playListener = new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                playAudio();
            }
        };
        play.addActionListener(playListener);
        content.add(play, BorderLayout.SOUTH);
    }

    private void captureAudio() {
        try {
            final AudioFormat format = getFormat();
            DataLine.Info info = new DataLine.Info(
                TargetDataLine.class, format);
            final TargetDataLine line = (TargetDataLine)
                AudioSystem.getLine(info);
            line.open(format);
            line.start();
            // Capture on a separate thread so the GUI stays responsive.
            Runnable runner = new Runnable() {
                int bufferSize = (int) format.getSampleRate()
                    * format.getFrameSize();
                byte[] buffer = new byte[bufferSize];

                public void run() {
                    out = new ByteArrayOutputStream();
                    running = true;
                    try {
                        while (running) {
                            int count =
                                line.read(buffer, 0, buffer.length);
                            if (count > 0) {
                                out.write(buffer, 0, count);
                            }
                        }
                        out.close();
                    } catch (IOException e) {
                        System.err.println("I/O problems: " + e);
                        System.exit(-1);
                    }
                }
            };
            Thread captureThread = new Thread(runner);
            captureThread.start();
        } catch (LineUnavailableException e) {
            System.err.println("Line unavailable: " + e);
            System.exit(-2);
        }
    }

    private void playAudio() {
        try {
            byte[] audio = out.toByteArray();
            InputStream input =
                new ByteArrayInputStream(audio);
            final AudioFormat format = getFormat();
            final AudioInputStream ais =
                new AudioInputStream(input, format,
                    audio.length / format.getFrameSize());
            DataLine.Info info = new DataLine.Info(
                SourceDataLine.class, format);
            final SourceDataLine line = (SourceDataLine)
                AudioSystem.getLine(info);
            line.open(format);
            line.start();

            // Play back on a separate thread for the same reason.
            Runnable runner = new Runnable() {
                int bufferSize = (int) format.getSampleRate()
                    * format.getFrameSize();
                byte[] buffer = new byte[bufferSize];

                public void run() {
                    try {
                        int count;
                        while ((count = ais.read(
                                buffer, 0, buffer.length)) != -1) {
                            if (count > 0) {
                                line.write(buffer, 0, count);
                            }
                        }
                        line.drain();
                        line.close();
                    } catch (IOException e) {
                        System.err.println("I/O problems: " + e);
                        System.exit(-3);
                    }
                }
            };
            Thread playThread = new Thread(runner);
            playThread.start();
        } catch (LineUnavailableException e) {
            System.err.println("Line unavailable: " + e);
            System.exit(-4);
        }
    }

    private AudioFormat getFormat() {
        float sampleRate = 8000;
        int sampleSizeInBits = 8;
        int channels = 1;
        boolean signed = true;
        boolean bigEndian = true;
        return new AudioFormat(sampleRate,
            sampleSizeInBits, channels, signed, bigEndian);
    }

    public static void main(String[] args) {
        JFrame frame = new Capture();
        frame.pack();
        frame.setVisible(true);
    }
}

You can find a more complete example of using the Java Sound API at the Java Sound Demo page. That example also shows an oscilloscope of the sound wave as it is playing back.

Copyright (c) 2004-2005 Sun Microsystems, Inc.
All Rights Reserved.
