Ucto-Webservice

Ucto is a rule-based tokeniser for multiple languages. This webservice makes it available to both humans and machines.

Provided tools & services

Ucto Webservice

Ucto is a Unicode-compliant tokeniser. It takes one or more untokenised texts as input and tokenises them. Several languages are supported, and the software can be extended to others.
Type
  • Web Application
Service Provider
      Centre for Language and Speech Technology, Radboud University and KNAW Humanities Cluster
Input data
  Name       Description                            Type                 Encoding Format
  *.txt      Text document                          DigitalDocument      text/plain
Output data
  Name       Description                            Type                 Encoding Format
  *.vtok     Verbosely Tokenised Text Document      DigitalDocument      text/plain
  error.log  Log file with (standard) error output  DigitalDocument      text/plain
  *.tok      Tokenised Text Document                DigitalDocument      text/plain
  *.xml      Tokenised Text Document (FoLiA XML)    TextDigitalDocument  text/xml
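
To illustrate the step from an untokenised text document to a tokenised one, here is a minimal sketch of rule-based tokenisation in Python. It is not Ucto's implementation and does not use Ucto's language-specific rule files: the single regular expression `TOKEN_RULE` and the helpers `tokenise` and `to_tok_line` are hypothetical names chosen for this example, and the space-separated output only loosely resembles a plain-text tokenised file.

```python
import re
from typing import List

# Hypothetical, simplified rule (an assumption for illustration only):
# a token is either a run of word characters or a single character that is
# neither a word character nor whitespace (punctuation, symbols).
TOKEN_RULE = re.compile(r"\w+|[^\w\s]")


def tokenise(text: str) -> List[str]:
    """Split text into word and punctuation tokens.

    In Python 3, \\w is Unicode-aware for str patterns, so accented
    letters stay inside their word.
    """
    return TOKEN_RULE.findall(text)


def to_tok_line(text: str) -> str:
    """Join the tokens with single spaces, one token stream per input text."""
    return " ".join(tokenise(text))


if __name__ == "__main__":
    print(to_tok_line("Hello, world! Ça va?"))
```

A real rule-based tokeniser needs many more rules than this (abbreviations, URLs, numbers, sentence boundaries), which is why language-specific configuration matters.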

Ucto-Webservice

Type
  • Unknown

Citation

You can cite this software using the following citation generated from its metadata:

(2023) Ucto-Webservice 2.5. KNAW Humanities Cluster & CLST, Radboud University.

Logs & Reviews

Name
Automatic software metadata validation report for Ucto-Webservice 2.5
Author
  • codemetapy validator using software.ttl
Date
2023-11-27 05:39:13
Review
Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any problems found.

Validation of Ucto-Webservice 2.5 was successful (score=3/5), but there are some warnings which should be addressed:

1. Warning: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
2. Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)
3. Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
4. Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
5. Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
6. Info: A research domain *SHOULD* be expressed as a category using the NWO Research Fields vocabulary, if applicable (This is missing in the metadata)
7. Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)
Rating
★ ★ ★ ☆ ☆
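
As a rough illustration of what the "missing in the metadata" warnings refer to, the sketch below checks a codemeta-style dictionary for a few recommended properties. The mapping from warnings to the property names `continuousIntegration`, `softwareHelp`, `referencePublication` and `funder` is an assumption made for this example; the helper `missing_recommended` is hypothetical and is not how the codemetapy validator works or computes its score.

```python
from typing import Dict, List

# Assumed mapping (for illustration only) from recommended metadata
# properties to the kind of information each one carries.
RECOMMENDED = {
    "continuousIntegration": "continuous integration service",
    "softwareHelp": "documentation",
    "referencePublication": "reference publications",
    "funder": "funder",
}


def missing_recommended(metadata: Dict[str, object]) -> List[str]:
    """Return descriptions of recommended properties that are absent or empty."""
    return [label for key, label in RECOMMENDED.items() if not metadata.get(key)]


# Example record containing only values that appear on this page.
example = {
    "name": "Ucto-Webservice",
    "version": "2.5",
    "codeRepository": "https://github.com/proycon/ucto_webservice",
}

if __name__ == "__main__":
    for label in missing_recommended(example):
        print("Missing:", label)
```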
There was 1 error while harvesting this metadata; please inspect the log.
-- begin log --
Passed 2 files/sources but specified 0 input types! Automatically guessing types...
Detected input types: [('/tmp/out/ucto-service.codemeta.json', 'json'), ('https://webservices.cls.ru.nl/ucto/', 'web')]
Adding to contextgraph: /tmp/turtle
Initial URI automatically generated, may be overriden later: https://webservices.cls.ru.nl/portal/ucto-service
Processing source #1 of 2
Parsing json-ld file from /tmp/out/ucto-service.codemeta.json
Traceback (most recent call last):
  File "/usr/bin/codemetapy", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.10/site-packages/codemeta/codemeta.py", line 335, in main
    g, res, args, contextgraph = build(**args.__dict__)
  File "/usr/lib/python3.10/site-packages/codemeta/codemeta.py", line 678, in build
    with getstream(source) as f:
  File "/usr/lib/python3.10/site-packages/codemeta/common.py", line 884, in getstream
    return open(source,'r',encoding='utf-8')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/out/ucto-service.codemeta.json'
-- end log --

[harvester error] Failed to obtain or process metadata from remote service URL https://webservices.cls.ru.nl/ucto/ for ucto-service

[harvester info] <-- Finished processing ucto-service (https://github.com/proycon/ucto_webservice) [Tue Nov 28 04:02:39 UTC 2023]
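
The traceback above ends in a FileNotFoundError raised by `open()` on a local source path that did not exist. As a sketch of one way a harvesting script could surface this earlier, the hypothetical helper `split_sources` below separates usable sources from missing local files before any parsing starts. It is an assumption for illustration, not part of codemetapy or the harvester, and it deliberately does not test whether URLs are reachable.

```python
import os
from typing import List, Tuple


def split_sources(sources: List[str]) -> Tuple[List[str], List[str]]:
    """Partition sources into (usable, missing).

    URLs are passed through unchecked; only local paths are tested
    for existence on disk.
    """
    usable: List[str] = []
    missing: List[str] = []
    for src in sources:
        if src.startswith(("http://", "https://")) or os.path.isfile(src):
            usable.append(src)
        else:
            missing.append(src)
    return usable, missing


if __name__ == "__main__":
    # The two sources named in the log above.
    usable, missing = split_sources([
        "/tmp/out/ucto-service.codemeta.json",
        "https://webservices.cls.ru.nl/ucto/",
    ])
    for path in missing:
        print("Skipping missing source file:", path)
    print("Sources to process:", usable)
```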

        

Metadata Properties

Version
2.5 (release notes)
Interface types
  • Web Application
Software website
Source code repository
https://github.com/proycon/ucto_webservice
Category
  • Internet > WWW/HTTP > WSGI > Application
  • Text Processing > Linguistic
Keywords
  • clam
  • webservice
  • rest
  • nlp
  • computational_linguistics
Development Status
  • 8 - Complete: Technology complete and qualified, released for all end-users in scholarly environments.
  • Active: The project has reached a stable, usable state and is being actively developed.
Issue Tracker (Support)
https://github.com/proycon/ucto_webservice/issues
Documentation
License
Author(s)
Maintainer(s)
Contributor(s)
Producer
  • KNAW Humanities Cluster & CLST, Radboud University
Programming Language
  • Python
Runtime Platform
  • Python 3
  • Python 3.10
  • Python 3.6
  • Python 3.7
  • Python 3.8
  • Python 3.9
Operating System
  • BSD
  • Linux
  • macOS
Software dependencies
  • CLAM
  • FoLiA-tools
Metadata validation
★ ★ ★ ☆ ☆
Created
2022-04-08 14:07:37 +0200
Last modified
2023-11-01 11:39:12 +0100