* common : implement parser combinators to simplify chat parsing
* add virtual destructor to parser_base
* fix memory leak from circular references of rules
* implement gbnf grammar building
* remove unused private variable
* create a base visitor and implement id assignment as a visitor
* fix const ref for grammar builder
* clean up types, friend classes, and class declarations
* remove builder usage from until_parser
* Use a counter class to help assign rule ids
* cache everything
* add short description for each parser
* create a type for the root parser
* implement repetition parser
* Make optional, one_or_more, and zero_or_more subclasses of repetition
* improve context constructor
* improve until parsing and add benchmarks
* remove cached() pattern, cache in parser_base with specialized parsing functions for each parser
* improve json parsing performance to better match legacy parsing
* fix const auto * it for windows
* move id assignment to classes instead of using a visitor
* create named rules in the command r7b example
* use '.' for any in GBNF
* fix parens around choices in gbnf grammar
* add convenience operators to turn strings to literals
* add free-form operators for const char * to simplify defining literals
* simplify test case parser
* implement semantic actions
* remove groups in favor of actions and a scratchpad
* add built-in actions for common operations
* add actions to command r7b example
* use std::default_searcher for platforms that don't have bm
* improve parser_type handling and add cast helper
* add partial result type to better control when to run actions
* fix bug in until()
* run actions on partial results by default
* use common_chat_msg for result
* add qwen3 example wip
* trash partial idea and simplify
* move action arguments to a struct
* implement aho-corasick matcher for until_parser and to build exclusion grammars
* use std::string for input, since std::string_view is incompatible with std::regex
* Refactor tests
* improve qwen3 example
* implement sax-style parsing and refactor
* fix json string in test
* rename classes to use common_chat_ prefix
* remove is_ suffix from functions
* rename from id_counter to just counter
* Final refactored tests
* Fix executable name and editorconfig-checker
* Third time's the charm...
* add trigger parser to begin lazy grammar rule generation
* working lazy grammar
* refactor json rules now that we check for reachability
* reduce pointer usage
* print out grammars in example
* rename to chat-peg-parser* and common_chat_peg_parser*
* Revert unrelated changes
* New macros for CMakeLists to enable multi-file compilations
* starting unicode support
* add unicode support to char_parser
* use unparsed args as additional sources
* Refactor tests to new harness
* Fix CMakeLists
* fix rate calculation
* add unicode tests
* fix trailing whitespace and line endings (skip-checks: true)
* Helpers + rewrite qwen3 with helpers
* Fix whitespace
* extract unicode functions to separate file
* refactor parse unicode function
* fix compiler error
* improve construction of sequence/choice parsers
* be less clever
* add make_parser helper function
* expand usage of make_parser, alias common_chat_msg_peg_parser_builder to builder in source
* lower bench iterations
* add unicode support to until_parser
* add unicode support to json_string_parser
* clean up unicode tests
* reduce unicode details to match src/unicode.cpp
* simplify even further
* remove unused functions
* fix type
* reformat char class parsing
* clean up json string parser
* clean up + fix diagnostics
* reorder includes
* compact builder functions
* replace action_parser with capture_parser, rename env to semantics
* rename env to semantics
* clean up common_chat_parse_context
* move type() to below constant
* use default constructor for common_chat_peg_parser
* make all operators functions for consistency
* fix compilation errors in test-optional.cpp
* simplify result values
* rename json_string_unquoted to json_string_content
* Move helper to separate class, add separate explicit and helper classes
* Whitespace
* Change + to append()
* Reformat
* Add extra helpers, tests and Minimax example
* Add some extra optional debugging prints + real example of how to use them
* fix bug in repetitions when min_count = 0 reports failures
* dump rule in debug
* fix token accumulation and assert parsing never fails
* indent debug by depth
* use LOG_* in tests so logs sync up with test logs
* Add selective testing; refactor all messaging to use LOG_ERR; fix lack of argument / tool name capturing; temporary fix for double event capture
* refactor rule() and introduce ref()
* clean up visitor
* clean up indirection in root parser w.r.t. rules
* store shared ptr directly in parser classes
* replace aho-corasick automaton with a simple trie
* Reset prev for qwen3 helper example variant
* refactor to use value semantics with std::variant/std::visit
* simplify trie_matcher result
* fix linting issues
* add annotations to rules
* revert test workaround
* implement serializing the parser
* remove redundant parsers
* remove tests
* gbnf generation fixes
* remove LOG_* use in tests
* update gbnf tests to test entire grammar
* clean up gbnf generation and fix a few bugs
* fix typo in test output
* remove implicit conversion rules
* improve test output
* rename trie_matcher to trie
* simplify trie to just know if a node is the end of a word
* remove common_chat_ prefix and ensure a common_peg_ prefix on all types
* rename chat-peg-parser -> peg-parser
* promote chat-peg-parser-helper to chat-peg-parser
* checkpoint
* use a static_assert to ensure we handle every branch
* inline trivial peg parser builders
* use json strings for now
* implement basic and native chat peg parser builders/extractors
* resolve refs to their rules
* remove packrat caching (for now)
* update tests
* compare parsers with incremental input
* benchmark both complete and incremental parsing
* add raw string generation from json schema
* add support for string schemas in gbnf generation
* fix qwen example to include \n
* tidy up example
* rename extractor to mapper
* rename ast_arena to ast
* place basic tests into one
* use gbnf_format_literal from json-schema-to-grammar
* integrate parser with common/chat and server
* clean up schema and serialization
* add json-schema raw string tests
* clean up json creation and remove capture parser
* trim spaces from reasoning and content
* clean up redundant rules and comments
* rename input_is_complete to is_partial to match rest of project
* simplify json rules
* remove extraneous file
* remove comment
* implement += and |= operators
* add comments to qwen3 implementation
* reorder arguments to common_chat_peg_parse
* remove commented outdated tests
* add explicit copy constructor
* fix operators and constness
* wip: update test-chat for qwen3-coder
* bring json parser closer to json-schema-to-grammar rules
* trim trailing space for most things
* fix qwen3-coder rules w.r.t. trailing spaces
* group rules
* do not trim trailing space from string args
* tweak spacing of qwen3 grammar
* update qwen3-coder tests
* qwen3-coder small fixes
* place parser in common_chat_syntax to simplify invocation
* use std::set to collect rules to keep order predictable for tests
* initialize parser to make certain platforms happy
* revert back to std::unordered_set, sort rule names at the end instead
* uncomment rest of chat tests
* define explicit default constructor
* improve arena init and server integration
* fix chat test
* add json_member()
* add a comprehensive native example
* clean up example qwen test and add response_format example to native test
* make build_peg_parser accept std::function instead of template
* change peg parser parameters into const ref
* push tool call on tool open for constructed parser
* add parsing documentation
* clean up some comments
* add json schema support to qwen3-coder
* add id initializer in tests
* remove grammar debug line from qwen3-coder
* refactor qwen3-coder to use sequence over operators
* only call common_chat_peg_parse if appropriate format
* simplify qwen3-coder space handling
* revert qwen3-coder implementation
* revert json-schema-to-grammar changes
* remove unnecessary forward declaration
* small adjustment to until_parser
* rename C/C++ files to use dashes
* codeowners : add aldehir to peg-parser and related files

---------

Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>
244 lines
10 KiB
CMake
llama_add_compile_flags()

function(llama_build source)
    set(TEST_SOURCES ${source} ${ARGN})

    if (DEFINED LLAMA_TEST_NAME)
        set(TEST_TARGET ${LLAMA_TEST_NAME})
    else()
        get_filename_component(TEST_TARGET ${source} NAME_WE)
    endif()

    add_executable(${TEST_TARGET} ${TEST_SOURCES})
    target_link_libraries(${TEST_TARGET} PRIVATE common)
    install(TARGETS ${TEST_TARGET} RUNTIME)
endfunction()

function(llama_test target)
    include(CMakeParseArguments)
    set(options)
    set(oneValueArgs NAME LABEL WORKING_DIRECTORY)
    set(multiValueArgs ARGS)
    cmake_parse_arguments(LLAMA_TEST "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})

    if (NOT DEFINED LLAMA_TEST_LABEL)
        set(LLAMA_TEST_LABEL "main")
    endif()
    if (NOT DEFINED LLAMA_TEST_WORKING_DIRECTORY)
        set(LLAMA_TEST_WORKING_DIRECTORY .)
    endif()
    if (DEFINED LLAMA_TEST_NAME)
        set(TEST_NAME ${LLAMA_TEST_NAME})
    else()
        set(TEST_NAME ${target})
    endif()

    set(TEST_TARGET ${target})

    add_test(
        NAME ${TEST_NAME}
        WORKING_DIRECTORY ${LLAMA_TEST_WORKING_DIRECTORY}
        COMMAND $<TARGET_FILE:${TEST_TARGET}>
        ${LLAMA_TEST_ARGS})

    set_property(TEST ${TEST_NAME} PROPERTY LABELS ${LLAMA_TEST_LABEL})
endfunction()

function(llama_test_cmd target)
    include(CMakeParseArguments)
    set(options)
    set(oneValueArgs NAME LABEL WORKING_DIRECTORY)
    set(multiValueArgs ARGS)
    cmake_parse_arguments(LLAMA_TEST "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})

    if (NOT DEFINED LLAMA_TEST_LABEL)
        set(LLAMA_TEST_LABEL "main")
    endif()
    if (NOT DEFINED LLAMA_TEST_WORKING_DIRECTORY)
        set(LLAMA_TEST_WORKING_DIRECTORY .)
    endif()
    if (DEFINED LLAMA_TEST_NAME)
        set(TEST_NAME ${LLAMA_TEST_NAME})
    else()
        set(TEST_NAME ${target})
    endif()

    add_test(
        NAME ${TEST_NAME}
        WORKING_DIRECTORY ${LLAMA_TEST_WORKING_DIRECTORY}
        COMMAND ${target}
        ${LLAMA_TEST_ARGS})

    set_property(TEST ${TEST_NAME} PROPERTY LABELS ${LLAMA_TEST_LABEL})
endfunction()

# Builds and runs a test source file.
# Optional args:
# - NAME: name of the executable & test target (defaults to the source file name without extension)
# - LABEL: label for the test (defaults to main)
# - ARGS: arguments to pass to the test executable
# - WORKING_DIRECTORY
function(llama_build_and_test source)
    include(CMakeParseArguments)
    set(options)
    set(oneValueArgs NAME LABEL WORKING_DIRECTORY)
    set(multiValueArgs ARGS)
    cmake_parse_arguments(LLAMA_TEST "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})

    set(TEST_SOURCES ${source} ${LLAMA_TEST_UNPARSED_ARGUMENTS} get-model.cpp)

    if (NOT DEFINED LLAMA_TEST_LABEL)
        set(LLAMA_TEST_LABEL "main")
    endif()
    if (NOT DEFINED LLAMA_TEST_WORKING_DIRECTORY)
        set(LLAMA_TEST_WORKING_DIRECTORY .)
    endif()
    if (DEFINED LLAMA_TEST_NAME)
        set(TEST_TARGET ${LLAMA_TEST_NAME})
    else()
        get_filename_component(TEST_TARGET ${source} NAME_WE)
    endif()

    add_executable(${TEST_TARGET} ${TEST_SOURCES})
    install(TARGETS ${TEST_TARGET} RUNTIME)
    target_link_libraries(${TEST_TARGET} PRIVATE common)

    add_test(
        NAME ${TEST_TARGET}
        WORKING_DIRECTORY ${LLAMA_TEST_WORKING_DIRECTORY}
        COMMAND $<TARGET_FILE:${TEST_TARGET}>
        ${LLAMA_TEST_ARGS})

    set_property(TEST ${TEST_TARGET} PROPERTY LABELS ${LLAMA_TEST_LABEL})
endfunction()

# build test-tokenizer-0 target once and add many tests
llama_build(test-tokenizer-0.cpp)

llama_test(test-tokenizer-0 NAME test-tokenizer-0-bert-bge ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-bert-bge.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-command-r ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-command-r.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-deepseek-coder ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-deepseek-coder.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-deepseek-llm ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-deepseek-llm.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-falcon ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-falcon.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-gpt-2 ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-gpt-2.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-llama-bpe ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-llama-bpe.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-llama-spm ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-llama-spm.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-mpt ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-mpt.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-phi-3 ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-phi-3.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-qwen2 ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-qwen2.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-refact ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-refact.gguf)
llama_test(test-tokenizer-0 NAME test-tokenizer-0-starcoder ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-starcoder.gguf)

if (NOT WIN32)
    llama_test_cmd(
        ${CMAKE_CURRENT_SOURCE_DIR}/test-tokenizers-repo.sh
        NAME test-tokenizers-ggml-vocabs
        WORKING_DIRECTORY ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}
        ARGS https://huggingface.co/ggml-org/vocabs ${PROJECT_SOURCE_DIR}/models/ggml-vocabs
    )
endif()

if (LLAMA_LLGUIDANCE)
    llama_build_and_test(test-grammar-llguidance.cpp ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-llama-bpe.gguf)
endif()

if (NOT WIN32 OR NOT BUILD_SHARED_LIBS)
    # these tests are disabled on Windows because they use internal functions not exported with LLAMA_API (when building with shared libraries)
    llama_build_and_test(test-sampling.cpp)
    llama_build_and_test(test-grammar-parser.cpp)
    llama_build_and_test(test-grammar-integration.cpp)
    llama_build_and_test(test-llama-grammar.cpp)
    llama_build_and_test(test-chat.cpp)
    # TODO: disabled on loongarch64 because the ggml-ci node lacks Python 3.8
    if (NOT ${CMAKE_SYSTEM_PROCESSOR} MATCHES "loongarch64")
        llama_build_and_test(test-json-schema-to-grammar.cpp WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
        target_include_directories(test-json-schema-to-grammar PRIVATE ${PROJECT_SOURCE_DIR}/tools/server)
    endif()

    if (NOT GGML_BACKEND_DL)
        llama_build(test-quantize-stats.cpp)
    endif()

    llama_build(test-gbnf-validator.cpp)

    # build test-tokenizer-1-bpe target once and add many tests
    llama_build(test-tokenizer-1-bpe.cpp)

    # TODO: disabled due to slowness
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-aquila ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-aquila.gguf)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-falcon ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-falcon.gguf)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-gpt-2 ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-gpt-2.gguf)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-gpt-neox ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-gpt-neox.gguf)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-llama-bpe ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-llama-bpe.gguf --ignore-merges)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-mpt ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-mpt.gguf)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-refact ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-refact.gguf)
    #llama_test(test-tokenizer-1-bpe NAME test-tokenizer-1-starcoder ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-starcoder.gguf)

    # build test-tokenizer-1-spm target once and add many tests
    llama_build(test-tokenizer-1-spm.cpp)

    llama_test(test-tokenizer-1-spm NAME test-tokenizer-1-llama-spm ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-llama-spm.gguf)
    #llama_test(test-tokenizer-1-spm NAME test-tokenizer-1-baichuan ARGS ${PROJECT_SOURCE_DIR}/models/ggml-vocab-baichuan.gguf)

    # llama_build_and_test(test-double-float.cpp) # SLOW
endif()

llama_build_and_test(test-chat-parser.cpp)
llama_build_and_test(test-chat-peg-parser.cpp peg-parser/simple-tokenize.cpp)
llama_build_and_test(test-chat-template.cpp)
llama_build_and_test(test-json-partial.cpp)
llama_build_and_test(test-log.cpp)
llama_build_and_test(
    test-peg-parser.cpp
    peg-parser/simple-tokenize.cpp
    peg-parser/test-basic.cpp
    peg-parser/test-gbnf-generation.cpp
    peg-parser/test-json-parser.cpp
    peg-parser/test-json-serialization.cpp
    peg-parser/test-unicode.cpp
    peg-parser/testing.h
    peg-parser/tests.h
)
llama_build_and_test(test-regex-partial.cpp)

if (NOT ${CMAKE_SYSTEM_PROCESSOR} MATCHES "s390x")
    llama_build_and_test(test-thread-safety.cpp ARGS -hf ggml-org/models -hff tinyllamas/stories15M-q4_0.gguf -ngl 99 -p "The meaning of life is" -n 128 -c 256 -ub 32 -np 4 -t 2)
else()
    llama_build_and_test(test-thread-safety.cpp ARGS -hf ggml-org/models -hff tinyllamas/stories15M-be.Q4_0.gguf -ngl 99 -p "The meaning of life is" -n 128 -c 256 -ub 32 -np 4 -t 2)
endif()

# this fails on windows (github hosted runner) due to curl DLL not found (exit code 0xc0000135)
if (NOT WIN32)
    llama_build_and_test(test-arg-parser.cpp)
endif()

if (NOT LLAMA_SANITIZE_ADDRESS AND NOT GGML_SCHED_NO_REALLOC)
    # TODO: repair known memory leaks
    llama_build_and_test(test-opt.cpp)
endif()
llama_build_and_test(test-gguf.cpp)
llama_build_and_test(test-backend-ops.cpp)

llama_build_and_test(test-model-load-cancel.cpp LABEL "model")
llama_build_and_test(test-autorelease.cpp LABEL "model")

if (NOT GGML_BACKEND_DL)
    # these tests use the backends directly and cannot be built with dynamic loading
    llama_build_and_test(test-barrier.cpp)
    llama_build_and_test(test-quantize-fns.cpp)
    llama_build_and_test(test-quantize-perf.cpp)
    llama_build_and_test(test-rope.cpp)
endif()

# libmtmd
set(LLAMA_TEST_NAME test-mtmd-c-api)
llama_build_and_test(test-mtmd-c-api.c)
target_link_libraries(${LLAMA_TEST_NAME} PRIVATE mtmd)

# dummy executable - not installed
get_filename_component(TEST_TARGET test-c.c NAME_WE)
add_executable(${TEST_TARGET} test-c.c)
target_link_libraries(${TEST_TARGET} PRIVATE llama)

llama_build_and_test(test-alloc.cpp)
target_include_directories(test-alloc PRIVATE ${PROJECT_SOURCE_DIR}/ggml/src)