ShazamKit

This documentation is intended for Android developers who wish to integrate audio recognition into their applications.

Set up

ShazamKit comes in the form of an Android Archive (AAR) file. Once downloaded, place the file in the libs directory in the root of your project. You may need to create the directory if it does not already exist. In order for Gradle to recognize any dependencies coming from the libs directory, the following snippet needs to be included in the top-level build.gradle file:

allprojects {
    repositories {
        flatDir {
            dirs 'libs'
        }
    }
}

Lastly, specify the dependency in your app/build.gradle file within the dependencies block, along with the Kotlin Coroutines, OkHttp, and Retrofit libraries used by ShazamKit, like so:

dependencies {
    implementation(name: "shazamkit-android-release", ext: "aar")
    implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-core:1.8.0'
    implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-android:1.8.0'
    implementation 'com.squareup.okhttp3:okhttp:4.12.0'
    implementation 'com.squareup.retrofit2:retrofit:2.11.0'
    implementation 'com.squareup.retrofit2:converter-gson:2.11.0'
}

For more information on how to include an AAR file in your project, see the Android Developers documentation.

Basic audio recognition using Session

The following snippet performs basic audio recognition on pre-recorded audio data using the ShazamCatalog.

val signatureGenerator = (ShazamKit.createSignatureGenerator(AudioSampleRateInHz.SAMPLE_RATE_48000) as Success).data

signatureGenerator.append(bytes, meaningfulLengthInBytes, System.currentTimeMillis())
val signature = signatureGenerator.generateSignature()

val catalog = ShazamKit.createShazamCatalog(developerTokenProvider, selectedLocale.value)
val session = (ShazamKit.createSession(catalog) as Success).data
val matchResult = session.match(signature)

The recognition is performed as soon as a Session is passed a Signature to match. The snippet uses the SignatureGenerator to process the given recorded audio into a Signature.

Depending on the sample rate of your recorded audio, you might want to adjust the com.shazam.shazamkit.AudioSampleRateInHz value accordingly.

Note that in order to use the ShazamCatalog you need to have an Apple Developer token, which you need to provide using your own DeveloperTokenProvider.
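As a minimal sketch, a token provider could look like the following. The exact interface shape should be verified against the SDK's API surface; the token string here is a placeholder:

```kotlin
// Sketch only: assumes DeveloperTokenProvider exposes a single
// provideDeveloperToken() method returning a DeveloperToken (verify against
// the SDK). Never hardcode a real token in production; fetch it from your
// own backend instead.
class MyDeveloperTokenProvider(private val token: String) : DeveloperTokenProvider {
    override fun provideDeveloperToken(): DeveloperToken {
        return DeveloperToken(token)
    }
}
```
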

Apps that intend to use the Shazam Catalog require Internet access. Make sure to include the INTERNET permission in your application's manifest file.
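The permission is declared in your app's AndroidManifest.xml like so:

```xml
<!-- Required for matching against the Shazam Catalog -->
<uses-permission android:name="android.permission.INTERNET" />
```
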

Error handling

All examples in this documentation handle the happy path only. In a real-world application, developers should carefully handle any case where a Failure might be returned from a ShazamKit operation. A convenient way of doing so is to use Kotlin's when to process the resulting ShazamKitResult.
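A sketch of that pattern, assuming ShazamKitResult is a sealed type with the Success and Failure branches used elsewhere in this documentation:

```kotlin
// Exhaustive handling of a ShazamKit operation's result with Kotlin's when.
// The Failure branch is logged generically here; inspect the failure object
// for whatever detail the SDK exposes.
when (val result = ShazamKit.createSession(catalog)) {
    is ShazamKitResult.Success -> {
        val session = result.data
        // proceed with matching using the session
    }
    is ShazamKitResult.Failure -> {
        Log.e("ShazamKit", "Could not create session: $result")
    }
}
```
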

Continuous audio recognition using StreamingSession

ShazamKit supports continuous audio recognition by flowing audio into a StreamingSession. Here is what that looks like:

val catalog = (ShazamKit.createShazamCatalog(developerTokenProvider) as Success).data
val currentSession = (ShazamKit.createStreamingSession(
    catalog,
    AudioSampleRateInHz.SAMPLE_RATE_48000,
    readBufferSize
) as Success).data

coroutineScope.launch {
    // record audio and flow it to the StreamingSession
    recordingFlow().collect { audioChunk ->
        currentSession?.matchStream(
            audioChunk.buffer,
            audioChunk.meaningfulLengthInBytes,
            audioChunk.timestamp
        )
    }
}

coroutineScope.launch {
    currentSession?.recognitionResults()?.collect { matchResult ->
        println("Received MatchResult: $matchResult")
    }
}

Audio recognition using a CustomCatalog

Developers can provide their own catalog instead of using the default Shazam Catalog. A CustomCatalog can be used with either a Session or a StreamingSession, via ShazamKit.createSession() or ShazamKit.createStreamingSession() respectively. There are no limitations on how or where you can store your Custom Catalog files.

The catalog in the following snippet is loaded from a local Uri retrieved using the ACTION_OPEN_DOCUMENT Intent action:

val inputStream = contentResolver.openInputStream(uri)
val customCatalog = ShazamKit.createCustomCatalog()
    .apply { addFromCatalog(inputStream) }

val session = (ShazamKit.createSession(customCatalog) as Success).data
val matchResult = session.match(signature)

Custom catalogs support:

  • Timed Media Items: Implement exact audio matching by specifying when an event starts and stops using time ranges. For a deeper overview, see https://developer.apple.com/videos/play/wwdc2022/10028/

  • Frequency Skew Ranges: The range specifies, as a percentage, how much the audio differs from the original. A value of zero indicates the audio is unskewed, and a value of 0.01 indicates a 1 percent skew. For a deeper overview, see https://developer.apple.com/videos/play/wwdc2022/10028/

Requirements

Changelog

  • ShazamKit 2.0

    • Adds "Timed Media Items"

    • Adds "Frequency Skew Ranges"

  • ShazamKit 2.0.1

    • Updates link to "Create a media identifier" webpage

  • ShazamKit 2.0.2

    • Lowers the "minSdkVersion" to 21

  • ShazamKit 2.1.0

    • Adds support for 16 KB page sizes

    • Updates dependency versions

  • ShazamKit 2.1.1

    • Improves audio recognition. No API changes.

Record from Microphone

Developers can provide the SDK with audio obtained from any source as long as the audio format is PCM 16-bit MONO in one of the following sample rates: 48000 Hz, 44100 Hz, 32000 Hz, 16000 Hz. For more details see ShazamKit.createStreamingSession() or ShazamKit.createSignatureGenerator().

Here is an example you can use as a starting point for audio recording on Android:

@RequiresPermission(Manifest.permission.RECORD_AUDIO)
@WorkerThread
private fun simpleMicRecording(catalog: Catalog): ByteArray {
    val audioSource = MediaRecorder.AudioSource.UNPROCESSED

    val audioFormat = AudioFormat.Builder()
        .setChannelMask(AudioFormat.CHANNEL_IN_MONO)
        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
        .setSampleRate(48_000)
        .build()

    val audioRecord = AudioRecord.Builder()
        .setAudioSource(audioSource)
        .setAudioFormat(audioFormat)
        .build()

    // maximumQuerySignatureDurationInMs is in milliseconds, so convert to seconds
    val seconds = catalog.maximumQuerySignatureDurationInMs / 1000

    // Final desired buffer size to allocate 12 seconds of audio
    val size = audioFormat.sampleRate * audioFormat.encoding.toByteAllocation() * seconds
    val destination = ByteBuffer.allocate(size)

    // Small buffer to retrieve chunks of audio
    val bufferSize = AudioRecord.getMinBufferSize(
        48_000,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT
    )

    // Make sure you are on a dedicated thread or thread pool for mic recording only and
    // elevate the priority to THREAD_PRIORITY_URGENT_AUDIO
    Process.setThreadPriority(Process.THREAD_PRIORITY_URGENT_AUDIO)

    audioRecord.startRecording()
    val readBuffer = ByteArray(bufferSize)
    while (destination.remaining() > 0) {
        val actualRead = audioRecord.read(readBuffer, 0, bufferSize)
        val byteArray = readBuffer.sliceArray(0 until actualRead)
        destination.putTrimming(byteArray)
    }
    audioRecord.release()
    return destination.array()
}

private fun Int.toByteAllocation(): Int {
    return when (this) {
        AudioFormat.ENCODING_PCM_16BIT -> 2
        else -> throw IllegalArgumentException("Unsupported encoding")
    }
}

fun ByteBuffer.putTrimming(byteArray: ByteArray) {
    if (byteArray.size <= this.capacity() - this.position()) {
        this.put(byteArray)
    } else {
        this.put(byteArray, 0, this.capacity() - this.position())
    }
}

For further details, see the Android Developers documentation.

Packages
