conf.py: tweak SearchEnglish to be hyphen-friendly

This modifies the default indexer split() and js splitQuery() methods to
support searching for words with hyphens.

While this might not be an ideal, rock-solid, and fully future-proof
solution, it at least makes it possible to search for strings including
hyphens, such as 'bitbake-layers', 'send-error-report', or 'oe-core'.

Below is a more detailed explanation of the two modifications:

1) The default split regex in the sphinx-doc SearchLanguage base class
   is:

   | _word_re = re.compile(r'\w+')

   which we simply extend to include hyphens '-'.

   This results in a searchindex.js that also contains words with
   hyphens.

2) The 'searchtool.js' code notes for its splitQuery() implementation:

   | /**
   |  * Default splitQuery function. Can be overridden in ``sphinx.search`` with a
   |  * custom function per language.
   |  *
   |  * The regular expression works by splitting the string on consecutive characters
   |  * that are not Unicode letters, numbers, underscores, or emoji characters.
   |  * This is the same as ``\W+`` in Python, preserving the surrogate pair area.
   |  */
   | if (typeof splitQuery === "undefined") {
   |   var splitQuery = (query) => query
   |       .split(/[^\p{Letter}\p{Number}_\p{Emoji_Presentation}]+/gu)
   |       .filter(term => term)  // remove remaining empty strings
   | }

   The hook for this is documented in the sphinx-doc 'SearchLanguage'
   base class.

   | .. attribute:: js_splitter_code
   |
   |    Return splitter function of JavaScript version. The function should be
   |    named as ``splitQuery``. And it should take a string and return list of
   |    strings.
   |
   |    .. versionadded:: 3.0

   We use this to define a splitQuery() function whose split regex is
   the default one extended with '-', so that queries are no longer
   split on hyphens.

We extend SearchEnglish (which extends SearchLanguage) here to retain
the stemmer code and stopwords for English.
[YOCTO #14534]

(From yocto-docs rev: ce18901b1059746069a0dea8893ba4a357772b51)

Signed-off-by: Enrico Jörns <ejo@pengutronix.de>
Signed-off-by: Antonin Godard <antonin.godard@bootlin.com>
(cherry picked from commit d4a98ee19e0cbd6be96923dc72faee143a6b294b)
Signed-off-by: Antonin Godard <antonin.godard@bootlin.com>
Signed-off-by: Steve Sakoman <steve@sakoman.com>
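
To illustrate the index-side change described in the commit message, here is a minimal sketch comparing the two word patterns on a sample string. It assumes the indexer's split() simply collects all matches of `_word_re`; the helper `split_words` and the sample sentence are hypothetical and not part of the commit.

```python
import re

# Default word pattern quoted in the commit message (SearchLanguage base class).
DEFAULT_WORD_RE = re.compile(r'\w+')
# Pattern introduced by this commit: hyphens count as word characters.
HYPHEN_WORD_RE = re.compile(r'[\w\-]+')


def split_words(pattern, text):
    """Hypothetical stand-in for the indexer's split(): return all matches."""
    return pattern.findall(text)


sample = "Run bitbake-layers show-layers in oe-core"

default_tokens = split_words(DEFAULT_WORD_RE, sample)
hyphen_tokens = split_words(HYPHEN_WORD_RE, sample)

print(default_tokens)  # hyphenated names are broken into their parts
print(hyphen_tokens)   # hyphenated names are kept as single tokens
```

With the default pattern, 'bitbake-layers' is indexed only as 'bitbake' and 'layers'; with the extended pattern it is indexed as one word, which is what lets the search index contain hyphenated terms.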
author: Enrico Jörns <ejo@pengutronix.de> 2025-05-20 11:45:14 +0200
committer: Steve Sakoman <steve@sakoman.com> 2025-05-27 09:38:57 -0700
commit: d5d8a11fc907e5f3b4954594ae487bd3586f52bb (patch)
tree: 08113ec6e0f9688fbdb73be39e53fd6d1b5c310d
parent: e159b7c2510fbf856c7b262e035d39e59f628a26 (diff)
download: poky-d5d8a11fc907e5f3b4954594ae487bd3586f52bb.tar.gz
1 files changed, 19 insertions, 0 deletions
diff --git a/documentation/conf.py b/documentation/conf.py
index 2aceeb8e79..ad60d91139 100644
--- a/documentation/conf.py
+++ b/documentation/conf.py
@@ -13,6 +13,7 @@
 # documentation root, use os.path.abspath to make it absolute, like shown here.
 #
 import os
+import re
 import sys
 import datetime
 try:
@@ -173,6 +174,24 @@ latex_elements = {
    'preamble': '\\usepackage[UTF8]{ctex}\n\\setcounter{tocdepth}{2}',
 }
 
+
+from sphinx.search import SearchEnglish
+from sphinx.search import languages
+class DashFriendlySearchEnglish(SearchEnglish):
+
+    # Accept words that can include hyphens
+    _word_re = re.compile(r'[\w\-]+')
+
+    js_splitter_code = """
+function splitQuery(query) {
+    return query
+        .split(/[^\p{Letter}\p{Number}_\p{Emoji_Presentation}-]+/gu)
+        .filter(term => term.length > 0);
+}
+"""
+
+languages['en'] = DashFriendlySearchEnglish
+
 # Make the EPUB builder prefer PNG to SVG because of issues rendering Inkscape SVG
 from sphinx.builders.epub3 import Epub3Builder
 Epub3Builder.supported_image_types = ['image/png', 'image/gif', 'image/jpeg']
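
The query-side effect of the new splitQuery() can be approximated in Python. This is only a sketch: Python's `re` module has no `\p{...}` property classes, so it relies on the searchtool.js comment quoted in the commit message, which equates the default pattern with `\W+` in Python, and it ignores emoji handling. The names `split_query`, `DEFAULT_SPLIT_RE`, and `HYPHEN_SPLIT_RE` are hypothetical.

```python
import re

# Python stand-in for the default JavaScript split pattern.
DEFAULT_SPLIT_RE = re.compile(r'\W+')
# Stand-in for the pattern in this commit: '-' is no longer a separator.
HYPHEN_SPLIT_RE = re.compile(r'[^\w-]+')


def split_query(pattern, query):
    """Split a query on the separator pattern and drop empty strings."""
    return [term for term in pattern.split(query) if term]


query = " bitbake-layers, oe-core "

default_terms = split_query(DEFAULT_SPLIT_RE, query)
hyphen_terms = split_query(HYPHEN_SPLIT_RE, query)

print(default_terms)
print(hyphen_terms)
```

Because the index and the query are now tokenized the same way, a query such as 'oe-core' can match the hyphenated word stored in searchindex.js. One consequence of this approach is that a standalone '-' in a query is kept as a term of its own.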