asrservice

An Automatic Speech Recognition Service for a variety of languages, powered by WhisperX

Provided tools & services

asrservice

Type
  • Unknown

Automatic Speech Recognition Service

Type
  • Web Application
Version
0.3
Input data
Name  | Description    | Type        | Encoding Format
*.ogg | Ogg audio file | AudioObject | audio/vorbis
*.wav | Wav audio file | AudioObject | audio/vnd.wave
*.mp4 | MP4 audio file | AudioObject | audio/mpeg
*.mp3 | MP3 audio file | AudioObject | audio/mpeg
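
As a minimal sketch of how a client could check a file against the input formats listed above before uploading, the mapping below copies the extensions and encoding formats from that listing. The names `ACCEPTED_INPUTS` and `declared_encoding` are hypothetical and not part of asrservice; the service itself may apply different or stricter checks.

```python
from pathlib import Path
from typing import Optional

# Extension -> encoding format, copied from the "Input data" listing.
# Illustrative only; not taken from the asrservice source code.
ACCEPTED_INPUTS = {
    ".ogg": "audio/vorbis",
    ".wav": "audio/vnd.wave",
    ".mp4": "audio/mpeg",
    ".mp3": "audio/mpeg",
}


def declared_encoding(filename: str) -> Optional[str]:
    """Return the listed encoding format for a file name, or None if its
    extension is not among the declared input formats."""
    return ACCEPTED_INPUTS.get(Path(filename).suffix.lower())
```

The lookup is case-insensitive on the extension, so `interview.WAV` is treated like `interview.wav`; whether the service does the same is an assumption.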
Output data
Name      | Description | Type | Encoding Format
*.tsv     | Timed transcriptions with speaker attribution (TSV) | Dataset | text/tab-separated-values
*.srt     | Timed transcriptions with speaker attribution (srt) | TextDigitalDocument | application/x-subrip
*.vtt     | Timed transcriptions with speaker attribution (WebVTT) | TextDigitalDocument | text/vtt
*.ctm     | Transcription with full word segmentation/alignment | DigitalDocument | text/plain
error.log | Log file with (standard) error output | DigitalDocument | text/plain
*.txt     | Plain text transcriptions without time stamps and speaker attribution | DigitalDocument | text/plain
*.ctm     | Transcription with full word segmentation/alignment and speaker attribution | DigitalDocument | text/plain
*.json    | Transcription with full word segmentation/alignment and speaker attribution | DigitalDocument | application/json
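
To illustrate how the timed output could be consumed, here is a minimal sketch that reads cues from a `*.srt` file using the general SubRip layout (an index line, a `start --> end` timing line, then one or more text lines). The `Cue` class and `parse_srt` function are hypothetical names. How the service writes the speaker label inside the cue text is not stated on this page, so the text is returned unchanged.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches a SubRip timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str     # cue text, including any speaker label, left as-is


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> List[Cue]:
    """Split SubRip content into cues; blocks without a timing line are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line belongs to the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues
```

Multi-line cue text is joined with single spaces, which is a simplification that suits plain transcripts but discards line breaks.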

Citation

You can cite this software using the following citation generated from its metadata:

Logs & Reviews

Name
Automatic software metadata validation report for asrservice 0.3
Author
  • codemetapy validator using software.ttl
Date
2024-09-17 04:01:18
Review
Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems

Validation of asrservice 0.3 was successful (score=3/5), but there are some warnings which should be addressed:

1. Warning: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
2. Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)
3. Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
4. Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
5. Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
6. Info: A research domain *SHOULD* be expressed as a category using the NWO Research Fields vocabulary, if applicable (This is missing in the metadata)
7. Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)
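
Several of the findings above report properties that are missing from the metadata. The sketch below checks a codemeta-style dictionary for four of them. The mapping from findings to the property names `continuousIntegration`, `softwareHelp`, `referencePublication` and `funder` is an assumption based on the CodeMeta and schema.org vocabularies, not taken from the validator's rules; `missing_recommended` is a hypothetical helper, and the example.org URL in the usage note is a placeholder.

```python
# Assumed mapping from property name to the finding it would address.
# The validator (software.ttl) may accept other properties or shapes.
RECOMMENDED = {
    "continuousIntegration": "link to a continuous integration service",
    "softwareHelp": "documentation",
    "referencePublication": "reference publications",
    "funder": "funder acknowledgement",
}


def missing_recommended(metadata: dict) -> list:
    """Return the recommended property names that are absent or empty."""
    return [key for key in RECOMMENDED if not metadata.get(key)]
```

For example, `missing_recommended({"name": "asrservice", "softwareHelp": "https://example.org/docs"})` would report the other three properties. The research domain and research activity findings are left out because they concern which vocabulary a category uses, which a simple presence check cannot decide.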
Rating
★ ★ ★ ☆ ☆
(log file starts at Tue Sep 17 04:01:04 UTC 2024)

[harvester info] --> Processing asrservice (https://github.com/opensource-spraakherkenning-nl/asrservice) [Tue Sep 17 04:01:04 UTC 2024]

[harvester info] Git updating cached clone of https://github.com/opensource-spraakherkenning-nl/asrservice...

[harvester info] Found release v0.3

[harvester info] Using 'v0.3'

[harvester info] Git reference: v0.3

[harvester info] Scanning directory /tmp/codemeta-harvester.cache/asrservice for harvestable resources...

[harvester info] found python setup for asrservice, converting to codemeta

-- begin log --

No input files specified, but found python project (setup.py) in current dir, using that...

Generating egg_info

running egg_info

writing asrservice.egg-info/PKG-INFO

writing dependency_links to asrservice.egg-info/dependency_links.txt

writing requirements to asrservice.egg-info/requires.txt

writing top-level names to asrservice.egg-info/top_level.txt

reading manifest file 'asrservice.egg-info/SOURCES.txt'

reading manifest template 'MANIFEST.in'

adding license file 'LICENSE'

writing manifest file 'asrservice.egg-info/SOURCES.txt'

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asrservice

Processing source #1 of 1

Obtaining python package metadata for: asrservice

Loading metadata from asrservice via importlib.metadata

WARNING: No translation for distutils or pyproject.toml key Metadata-Version

WARNING: No translation for distutils or pyproject.toml key License-File

WARNING: No translation for distutils or pyproject.toml key Description

Found dependency CLAM >= 3.2

Found dependency whisperx 

[CODEMETA COMPOSITION (asrservice)] processed 46 new triples, total is now 47

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asrservice -> https://webservices.cls.ru.nl/portal/asrservice/0.3

[CODEMETA VALIDATION (asrservice)] done

-- end log --

[harvester info] Looking for license....

[harvester info] Found license AGPL-3.0-only

-- begin log --

Trying README.md ...

Trying LICENSE ...

-- end log --

[harvester info] Getting contributors from git...

-- begin log --

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asrservice-contributors

Processing source #1 of 1

Extracting contributors from /tmp/codemeta-harvester.cache//tmp/asrservice.CONTRIBUTORS

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice-contributors)] processed 13 new triples, total is now 14

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asrservice-contributors -> https://webservices.cls.ru.nl/portal/asrservice.contributors/snapshot

[CODEMETA VALIDATION (https://webservices.cls.ru.nl/portal/asrservice.contributors/snapshot)] codeRepository not set

[CODEMETA VALIDATION (https://webservices.cls.ru.nl/portal/asrservice.contributors/snapshot)] author not set

[CODEMETA VALIDATION (https://webservices.cls.ru.nl/portal/asrservice.contributors/snapshot)] license not set

[CODEMETA VALIDATION (https://webservices.cls.ru.nl/portal/asrservice.contributors/snapshot)] done

-- end log --

[harvester info] Extracting last and first commit date from git log....

[harvester info] Date created: 2024-02-16T11:01:30Z+0100, date modified: 2024-04-12T10:39:45Z+0200

[harvester info] Querying Github/GitLab API (https://github.com/opensource-spraakherkenning-nl/asrservice)

-- begin log --

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asrservice

Processing source #1 of 1

Querying GitAPI parser for https://github.com/opensource-spraakherkenning-nl/asrservice

    Parsing Github API response

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 12 new triples, total is now 13

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asrservice -> https://webservices.cls.ru.nl/portal/asrservice/snapshot

[CODEMETA VALIDATION (https://webservices.cls.ru.nl/portal/asrservice/snapshot)] author not set

[CODEMETA VALIDATION (https://webservices.cls.ru.nl/portal/asrservice/snapshot)] done

Querying https://api.github.com/repos/opensource-spraakherkenning-nl/asrservice

Remaining github API requests: 4997 ### Next rate limit reset at: 2024-09-17 05:00:42 (has_token=True)

Querying https://api.github.com/users/opensource-spraakherkenning-nl

Remaining github API requests: 4996 ### Next rate limit reset at: 2024-09-17 05:00:42 (has_token=True)

-- end log --

[harvester info] Found buildInstructions in INSTALL

[harvester info] Found releaseNotes

[harvester info] Querying Zenodo API for DOI (access token provided)...

[harvester info] Looking for TRL information in README.md...

-- begin log --

-- end log --

[harvester info] Looking for repostatus information in README.md...

-- begin log --

-- end log --

[harvester info] Looking for continuous integration information in README.md...

-- begin log --

-- end log --

[harvester info] Looking for documentation links in README.md...

-- begin log --

-- end log --

[harvester info] Falling back to git tag (v0.3) if no version number is specified...

[harvester info] Inferring repostatus information from git activity (used only as a fallback if not explicitly provided)...

[harvester info] Inferred repostatus https://www.repostatus.org/#active

[harvester info] Looking for repostatus information in README.md in master branch...

-- begin log --

-- end log --

[harvester info] Found README.md

[harvester info] Reconciliating: codemetapy  --baseuri https://webservices.cls.ru.nl/portal --baseuri https://webservices.cls.ru.nl/portal --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl --identifier "asrservice" --codeRepository "https://github.com/opensource-spraakherkenning-nl/asrservice" --validate /etc/software.ttl --released --enrich --textv "Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems" -O /tmp/out/asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-version.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-repostatus.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/43-releasenotes.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/42-buildinstructions.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/41-readme.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/40-gitapi.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/39-gitdate.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/32-contributors.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/29-license.asrservice.codemeta.json /tmp/codemeta-harvester.cache//tmp/20-python.asrservice.codemeta.json 

-- begin log --

Passed 10 files/sources but specified 0 input types! Automatically guessing types...

Detected input types: [('/tmp/codemeta-harvester.cache//tmp/99-version.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/99-repostatus.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/43-releasenotes.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/42-buildinstructions.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/41-readme.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/40-gitapi.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/39-gitdate.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/32-contributors.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/29-license.asrservice.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/20-python.asrservice.codemeta.json', 'json')]

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asrservice

Processing source #1 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-version.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 1 new triples, total is now 2

Processing source #2 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-repostatus.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 1 new triples, total is now 3

Processing source #3 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/43-releasenotes.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 2 new triples, total is now 5

Processing source #4 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/42-buildinstructions.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 1 new triples, total is now 6

Processing source #5 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/41-readme.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 1 new triples, total is now 7

Processing source #6 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/40-gitapi.asrservice.codemeta.json

    Found main resource with URI https://webservices.cls.ru.nl/portal/asrservice/snapshot

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 14 new triples, total is now 20

Processing source #7 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/39-gitdate.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] overriding old http://schema.org/dateCreated (2024-02-16T10:06:53Z -> 2024-02-16T11:01:30Z+0100)

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] overriding old http://schema.org/dateModified (2024-04-15T12:24:08Z -> 2024-04-12T10:39:45Z+0200)

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 2 new triples, total is now 20

Processing source #8 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/32-contributors.asrservice.codemeta.json

    Found main resource with URI https://webservices.cls.ru.nl/portal/asrservice.contributors/snapshot

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 14 new triples, total is now 33

Processing source #9 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/29-license.asrservice.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] overriding old http://schema.org/license (http://spdx.org/licenses/AGPL-3.0-only -> AGPL-3.0-only)

[CODEMETA CORRECTION (https://webservices.cls.ru.nl/portal/asrservice)] automatically converting license to spdx URI

[CODEMETA COMPOSITION (https://webservices.cls.ru.nl/portal/asrservice)] processed 1 new triples, total is now 33

Processing source #10 of 10

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/20-python.asrservice.codemeta.json

    Found main resource with URI https://webservices.cls.ru.nl/portal/asrservice/0.3

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (asrservice)] overriding old https://codemeta.github.io/terms/developmentStatus (https://www.repostatus.org/#active -> https://www.repostatus.org/#wip)

[CODEMETA COMPOSITION (asrservice)] overriding old http://schema.org/license (http://spdx.org/licenses/AGPL-3.0-only -> http://spdx.org/licenses/GPL-3.0-only)

[CODEMETA COMPOSITION (asrservice)] overriding old http://schema.org/version (v0.3 -> 0.3)

[CODEMETA COMPOSITION (asrservice)] processed 64 new triples, total is now 88

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asrservice -> https://webservices.cls.ru.nl/portal/asrservice/0.3

[CODEMETA VALIDATION (asrservice)] done

[CODEMETA ENRICHMENT (asrservice)] Guessing interface type https://w3id.org/software-types#WebApplication based on clues

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] automatically adding programmingLanguage Python derived from runtimePlatform Python

[CODEMETA ENRICHMENT (asrservice)] considering first author as maintainer

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #1: Warning: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #2: Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #3: Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #4: Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #5: Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #6: Info: A research domain *SHOULD* be expressed as a category using the NWO Research Fields vocabulary, if applicable (This is missing in the metadata)

VALIDATION https://webservices.cls.ru.nl/portal/asrservice/0.3 #7: Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)

-- end log --

[harvester info] Output written to /tmp/out/asrservice.codemeta.json

[harvester info] Harvesting remote service URL https://webservices2.cls.ru.nl/asrservice/ for asrservice: codemetapy  --baseuri https://webservices.cls.ru.nl/portal --baseuri https://webservices.cls.ru.nl/portal --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl -O "/tmp/codemeta-harvester.cache//tmp/asrservice.codemeta.json" "/tmp/out/asrservice.codemeta.json" "https://webservices2.cls.ru.nl/asrservice/"

-- begin log --

Passed 2 files/sources but specified 0 input types! Automatically guessing types...

Detected input types: [('/tmp/out/asrservice.codemeta.json', 'json'), ('https://webservices2.cls.ru.nl/asrservice/', 'web')]

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/asrservice

Processing source #1 of 2

Parsing json-ld file from /tmp/out/asrservice.codemeta.json

    Found main resource with URI https://webservices.cls.ru.nl/portal/asrservice/0.3

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/asrservice

[CODEMETA COMPOSITION (asrservice)] processed 104 new triples, total is now 104

Processing source #2 of 2

Fallback: Obtaining metadata from remote URL https://webservices2.cls.ru.nl/asrservice/

    Service replied with content-type application/ld+json

    Parsing json...

    Found main resource with URI https://webservices2.cls.ru.nl/asrservice

    Injected (possibly temporary) URI https://webservices.cls.ru.nl/portal/webapplication/Nfa962cd5f2ae77afacd0154a25fdd63c

Adding service (targetProduct) https://webservices2.cls.ru.nl/asrservice/

[CODEMETA COMPOSITION (asrservice)] processed 83 new triples, total is now 188

Remapping URI to (possibly) new identifier and version component: https://webservices.cls.ru.nl/portal/asrservice -> https://webservices.cls.ru.nl/portal/asrservice/0.3

[CODEMETA VALIDATION (asrservice)] done

-- end log --

[harvester info] <-- Finished processing asrservice (https://github.com/opensource-spraakherkenning-nl/asrservice) [Tue Sep 17 04:01:22 UTC 2024]
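
The log above contains several `overriding old ...` lines, where a later metadata source replaces a value set by an earlier one (for instance the license and the development status). A minimal sketch for pulling those lines out of a saved copy of the log follows. The function name `find_overrides` is hypothetical, and the pattern is derived only from the line shapes visible in this log, so other harvester versions may format them differently.

```python
import re

# Matches e.g. "... overriding old http://schema.org/version (v0.3 -> 0.3)".
OVERRIDE = re.compile(r"overriding old (\S+) \((.+?) -> (.+)\)\s*$")


def find_overrides(log_text: str) -> list:
    """Return (property, old_value, new_value) tuples for every override line."""
    results = []
    for line in log_text.splitlines():
        match = OVERRIDE.search(line)
        if match:
            results.append(match.groups())
    return results
```

Listing the overrides this way makes it easier to spot cases where two sources disagree, such as the repository license file and the Python package metadata.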

        

Metadata Properties

Version
0.3 (release notes)
Interface types
  • Web Application
Software website
Source code repository
https://github.com/opensource-spraakherkenning-nl/asrservice
Category
  • Internet > WWW/HTTP > WSGI > Application
  • Text Processing > Linguistic
Keywords
  • clam
  • webservice
  • rest
  • nlp
  • computational_linguistics
Development Status
  • Experimental: The technology is implemented and ready for experimental settings (beta), but requires further work and validation.
  • WIP: Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.
Issue Tracker (Support)
https://github.com/opensource-spraakherkenning-nl/asrservice/issues
Documentation
License
Author(s)
Maintainer(s)
Contributor(s)
Producer
Programming Language
  • Python
Runtime Platform
  • Python 3
  • Python 3.10
  • Python 3.11
  • Python 3.6
  • Python 3.7
  • Python 3.8
  • Python 3.9
Operating System
  • BSD
  • Linux
  • macOS
Software dependencies
  • CLAM
  • whisperx
Metadata validation
★ ★ ★ ☆ ☆
Created
2024-02-16 11:01:30 +0100
Last modified
2024-04-12 10:39:45 +0200