Automatic Speech Recognition for Dutch

This is a web-based automatic speech recogniser for Dutch, capable of transcribing Dutch speech recordings using multiple models.

Provided tools & services

Automatic Speech Recognition for Dutch

Type
  • Unknown

Automatic Transcription of Dutch Speech Recordings

This web service uses automatic speech recognition to transcribe recordings of spoken Dutch. Each project accepts only one uploaded file. For bulk processing and other questions, please contact Henk van den Heuvel at h.vandenheuvel@let.ru.nl.
Type
  • Web Application
Version
0.6.1
Note: The deployed version does not match the latest source release (0.6.2); the service may be out of date.
Service Provider
      Centre for Language and Speech Technology, Radboud University
Input data
  Name    Description  Type         Encoding Format
  *.ogg   Ogg file     AudioObject  audio/vorbis
  *.flac  Flac file    AudioObject  audio/flac
  *.wav   Wav file     AudioObject  audio/vnd.wave
  *.mp4   MP4 file     AudioObject  audio/mpeg
  *.m4a   M4A file     AudioObject  audio/mpeg
  *.mp3   MP3 file     AudioObject  audio/mpeg
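
Before uploading, a client may want to check that a recording has one of the input types listed above. The following is a minimal sketch, not part of the service: the function name `encoding_format` is hypothetical, the table is copied from the listing, and matching extensions case-insensitively is an assumption about how the service behaves.

```python
from pathlib import Path

# Extension -> encoding format, copied from the "Input data" listing.
ACCEPTED_INPUTS = {
    ".ogg": "audio/vorbis",
    ".flac": "audio/flac",
    ".wav": "audio/vnd.wave",
    ".mp4": "audio/mpeg",
    ".m4a": "audio/mpeg",
    ".mp3": "audio/mpeg",
}


def encoding_format(filename):
    """Return the listed encoding format for a file name, or None if unlisted.

    Assumption: extensions are compared case-insensitively.
    """
    return ACCEPTED_INPUTS.get(Path(filename).suffix.lower())


for name in ("interview.WAV", "lecture.flac", "notes.txt"):
    print(name, "->", encoding_format(name))
```

A result of `None` means the extension is not in the listing, so the file would probably need converting before upload.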
Output data
  Name       Type             Encoding Format  Description
  *.txt      DigitalDocument  text/plain       Automatic transcription of the input recording
  *.xml      DigitalDocument  text/xml         Automatic transcription of the input recording (full data) (AudioDoc XML)
  error.log  DigitalDocument  text/plain       Log file with (standard) error output
  *.ctm.spk  DigitalDocument  text/plain       Automatic transcription of the input recording with timestamps (CTM) and speaker diarisation
  *.ctm      DigitalDocument  text/plain       Automatic transcription of the input recording with timestamps (CTM)
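
The *.ctm output carries word-level timestamps. The sketch below shows one way to read such a file, assuming it follows the conventional CTM column order (recording, channel, start time, duration, word, optional confidence). That layout is an assumption and is not confirmed by this page, and the layout of the *.ctm.spk variant is not covered. The sample lines and the names `CtmWord` and `parse_ctm` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class CtmWord(NamedTuple):
    recording: str
    channel: str
    start: float      # seconds from the start of the recording
    duration: float   # seconds
    word: str
    confidence: Optional[float]


def parse_ctm(text: str) -> List[CtmWord]:
    """Parse CTM text, assuming the conventional column order."""
    words = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and ';;' comment lines.
        if len(fields) < 5 or fields[0].startswith(";;"):
            continue
        confidence = float(fields[5]) if len(fields) > 5 else None
        words.append(CtmWord(fields[0], fields[1], float(fields[2]),
                             float(fields[3]), fields[4], confidence))
    return words


# Hypothetical sample, not real output of the service.
SAMPLE = """\
interview 1 0.00 0.42 goedemorgen 0.98
interview 1 0.42 0.31 allemaal 0.91
"""

words = parse_ctm(SAMPLE)
for w in words:
    print(f"{w.start:6.2f}-{w.start + w.duration:6.2f}  {w.word}")
```

Keeping start and duration as separate fields mirrors the file; an end time is derived only when needed.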

Citation

You can cite this software using the following citation generated from its metadata:

Logs & Reviews

Name
Automatic software metadata validation report for Automatic Speech Recognition for Dutch 0.6.2
Author
  • codemetapy validator using software.ttl
Date
2024-06-24 04:47:41
Review
Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems

Validation of Automatic Speech Recognition for Dutch 0.6.2 was successful (score=3/5), but there are some warnings which should be addressed:

1. Warning: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
2. Info: Software source code *MAY* express the programming language(s) used (This is missing in the metadata)
3. Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)
4. Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
5. Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
6. Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
Rating
★ ★ ★ ☆ ☆
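
The numbered findings in the review above follow a "number, severity, message" pattern, so they can be tallied by severity. The sketch below is only an illustration: the messages are abbreviated from the review, and the regular expression and the name `count_by_severity` are assumptions, not part of codemetapy.

```python
import re
from collections import Counter

# Findings abbreviated from the review; only the numbering and severity matter here.
REPORT = """\
1. Warning: Software source code *SHOULD* link to a continuous integration service
2. Info: Software source code *MAY* express the programming language(s) used
3. Info: An interface type *SHOULD* be expressed
4. Warning: Documentation *SHOULD* be expressed
5. Info: Reference publications *SHOULD* be expressed, if any
6. Info: The funder *SHOULD* be acknowledged
"""

# "<number>. <Severity>: <message>"
ENTRY = re.compile(r"^\s*(\d+)\.\s+(\w+):\s+(.*)$")


def count_by_severity(report: str) -> Counter:
    """Count review findings per severity label."""
    counts = Counter()
    for line in report.splitlines():
        match = ENTRY.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


counts = count_by_severity(REPORT)
print(dict(counts))  # {'Warning': 2, 'Info': 4}
```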
(log file starts at Mon Jun 24 04:47:38 UTC 2024)

[harvester info] --> Processing asr_nl (https://github.com/opensource-spraakherkenning-nl/asr_nl) [Mon Jun 24 04:47:38 UTC 2024]

[harvester info] Git updating cached clone of https://github.com/opensource-spraakherkenning-nl/asr_nl...

[harvester info] Found release v0.6.2

[harvester info] Using 'v0.6.2'

[harvester info] Git reference: v0.6.2

[harvester info] Scanning directory /tmp/codemeta-harvester.cache/asr_nl for harvestable resources...

[harvester info] found codemeta.json for asr_nl (md5sum c38d31855921f1b956862aa9c5b2bed7); **NOTE: this is considered authoritative and most other detection methods will be skipped now!**

[harvester info] Inferring repostatus information from git activity (used only as a fallback if not explicitly provided)...

[harvester info] Inferred repostatus https://www.repostatus.org/#active

[harvester info] Looking for repostatus information in README.md in master branch...

-- begin log --

-- end log --

[harvester info] Found README.md

[harvester info] Reconciliating: codemetapy  --baseuri https://webservices.cls.ru.nl/portal --baseuri https://webservices.cls.ru.nl/portal --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl --identifier "asr_nl" --codeRepository "https://github.com/opensource-spraakherkenning-nl/asr_nl" --validate /etc/software.ttl --released --enrich --textv "Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems" -O /tmp/out/asr_nl.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-repostatus.asr_nl.codemeta.json /tmp/codemeta-harvester.cache//tmp/41-readme.asr_nl.codemeta.json /tmp/codemeta-harvester.cache//tmp/10-jsonld.asr_nl.codemeta.json 

-- begin log --

Passed 3 files/sources but specified 0 input types! Automatically guessing types...

Detected input types: [('/tmp/codemeta-harvester.cache//tmp/99-repostatus.asr_nl.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/41-readme.asr_nl.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/10-jsonld.asr_nl.codemeta.json', 'json')]

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asr-nl

Processing source #1 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-repostatus.asr_nl.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asr-nl

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asr-nl)] processed 1 new triples, total is now 2

Processing source #2 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/41-readme.asr_nl.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asr-nl

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asr-nl)] processed 1 new triples, total is now 3

Processing source #3 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/10-jsonld.asr_nl.codemeta.json

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asr-nl

[CODEMETA COMPOSITION (asr_nl)] overriding old https://codemeta.github.io/terms/readme (https://github.com/proycon/alpino_clam_webservice/blob/v0.6.2/README.md -> https://github.com/opensource-spraakherkenning-nl/asr_nl/blob/master/README.md)

[CODEMETA CORRECTION (asr_nl)] automatically converting spdx license URI from https:// to http:///

[CODEMETA COMPOSITION (asr_nl)] processed 70 new triples, total is now 70

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asr-nl -> https://webservices.cls.ru.nl/portal/asr_nl/0.6.2

[CODEMETA VALIDATION (asr_nl)] done

[CODEMETA ENRICHMENT (asr_nl)] Guessing interface type https://w3id.org/software-types#WebApplication based on clues

[CODEMETA ENRICHMENT (asr_nl)] adding author https://webservices.cls.ru.nl/portal/stub/H-3b715d3958f63d27 as contributor

[CODEMETA ENRICHMENT (asr_nl)] adding author https://webservices.cls.ru.nl/portal/stub/H-61650b4af1ddca21 as contributor

[CODEMETA ENRICHMENT (asr_nl)] adding author https://webservices.cls.ru.nl/portal/stub/H-2165619b1d845a98 as contributor

VALIDATION https://webservices.cls.ru.nl/portal/asr_nl/0.6.2 #1: Warning: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asr_nl/0.6.2 #2: Info: Software source code *MAY* express the programming language(s) used (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asr_nl/0.6.2 #3: Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)

VALIDATION https://webservices.cls.ru.nl/portal/asr_nl/0.6.2 #4: Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asr_nl/0.6.2 #5: Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asr_nl/0.6.2 #6: Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)

-- end log --

[harvester info] Output written to /tmp/out/asr_nl.codemeta.json

[harvester info] Harvesting remote service URL https://webservices.cls.ru.nl/asr_nl/ for asr_nl: codemetapy  --baseuri https://webservices.cls.ru.nl/portal --baseuri https://webservices.cls.ru.nl/portal --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl -O "/tmp/codemeta-harvester.cache//tmp/asr_nl.codemeta.json" "/tmp/out/asr_nl.codemeta.json" "https://webservices.cls.ru.nl/asr_nl/"

-- begin log --

Passed 2 files/sources but specified 0 input types! Automatically guessing types...

Detected input types: [('/tmp/out/asr_nl.codemeta.json', 'json'), ('https://webservices.cls.ru.nl/asr_nl/', 'web')]

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asr-nl

Processing source #1 of 2

Parsing json-ld file from /tmp/out/asr_nl.codemeta.json

    Found main resource with URI https://webservices.cls.ru.nl/portal/asr_nl/0.6.2

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asr-nl

[CODEMETA COMPOSITION (asr_nl)] processed 109 new triples, total is now 109

Processing source #2 of 2

Fallback: Obtaining metadata from remote URL https://webservices.cls.ru.nl/asr_nl/

    Service replied with content-type application/ld+json

    Parsing json...

    Found main resource with URI https://webservices.cls.ru.nl/asr_nl

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/webapplication/N4164af2b65bc1949ea9507a77dfe099d

Adding service (targetProduct) https://webservices.cls.ru.nl/asr_nl/

[CODEMETA COMPOSITION (asr_nl)] processed 80 new triples, total is now 190

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asr-nl -> https://webservices.cls.ru.nl/portal/asr_nl/0.6.2

[CODEMETA VALIDATION (asr_nl)] done

-- end log --

[harvester info] <-- Finished processing asr_nl (https://github.com/opensource-spraakherkenning-nl/asr_nl) [Mon Jun 24 04:47:46 UTC 2024]
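
The "CODEMETA COMPOSITION" lines in the log above report how many triples each source contributed. As a rough sketch, they can be extracted with a regular expression. The pattern is inferred from the lines in this log only and may not hold for other codemetapy versions; `composition_steps` is a hypothetical helper name.

```python
import re

# Pattern inferred from this log; other log versions may differ.
STEP = re.compile(
    r"\[CODEMETA COMPOSITION \((?P<resource>[^)]*)\)\] "
    r"processed (?P<new>\d+) new triples, total is now (?P<total>\d+)"
)


def composition_steps(log_text: str):
    """Return (resource, new_triples, running_total) for each composition line."""
    return [
        (m.group("resource"), int(m.group("new")), int(m.group("total")))
        for m in STEP.finditer(log_text)
    ]


# Two lines copied from the second harvesting step in the log.
LOG = """\
[CODEMETA COMPOSITION (asr_nl)] processed 109 new triples, total is now 109
[CODEMETA COMPOSITION (asr_nl)] processed 80 new triples, total is now 190
"""

steps = composition_steps(LOG)
for resource, new, total in steps:
    print(f"{resource}: +{new} triples (total {total})")
```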

        

Metadata Properties

Version
0.6.2
Interface types
  • Web Application
Software website
Source code repository
https://github.com/opensource-spraakherkenning-nl/asr_nl
Category
  • Software for humanities
  • Speech Recognizing
Keywords
  • dutch
  • nlp
  • speech recognition
Development Status
  • Active: The project has reached a stable, usable state and is being actively developed.
  • 9 - Proven: Technology complete and proven in practice by real users.
Issue Tracker (Support)
https://github.com/opensource-spraakherkenning-nl/asr_nl/issues
Documentation
License
Author(s)
Maintainer(s)
Contributor(s)
Producer
Operating System
  • Linux
Software dependencies
  • CLAM
  • kaldi
  • Kaldi_NL
Metadata validation
★ ★ ★ ☆ ☆
Created
2017-04-02