customGPT.ai - Jahia indexer
Module Id
customgpt-ai
Group Id
org.jahia.modules.community
Updated
Requires Jahia
8.2.3.0
Author
Florent BOURASSE
Category
Business and Commerce
Status
COMMUNITY

Jahia module that integrates with the CustomGPT.ai API to index Jahia site content (pages and files) into a CustomGPT project, keeping it in sync with JCR publish/unpublish events.

Changelog 1.0.4

New features

  • Token-bucket rate limiting — the RateLimitInterceptor now enforces a configurable maximum request rate (default 10 req/s) using a token-bucket algorithm. Tokens are refilled lazily from elapsed wall-clock time; no background thread is needed. All API calls — indexing and purge alike — share the same bucket.
  • Configurable rate via Admin UI and OSGi config — the new rateLimit.requestsPerSecond property (org.jahia.community.modules.customgpt.rateLimit.requestsPerSecond) is exposed in the settings panel next to Batch Size. It can also be set directly in the .cfg file. Note: the OkHttp client reads the value at startup; a module restart is required for changes to take effect.
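The lazy refill described above can be illustrated with a minimal Python sketch. It is not the module's RateLimitInterceptor (which runs inside the OkHttp client); the class name TokenBucket, its methods, and the injectable clock are hypothetical names chosen for this illustration, and the bucket capacity is assumed to equal the configured rate.

```python
import threading
import time


class TokenBucket:
    """Token bucket refilled lazily from elapsed time (no background thread)."""

    def __init__(self, rate_per_second, clock=time.monotonic):
        self.rate = float(rate_per_second)
        self.capacity = self.rate          # assumption: burst size equals the rate
        self.clock = clock                 # injectable so the logic can be tested
        self.tokens = self.capacity
        self.last_refill = clock()
        self.lock = threading.Lock()       # one bucket is shared by all callers

    def _refill(self):
        # Credit the tokens earned since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def try_acquire(self):
        """Take one token if available; return True on success."""
        with self.lock:
            self._refill()
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False

    def seconds_until_next_token(self):
        """How long a caller would have to wait before try_acquire succeeds."""
        with self.lock:
            self._refill()
            if self.tokens >= 1.0:
                return 0.0
            return (1.0 - self.tokens) / self.rate
```

A caller that fails to acquire a token would typically sleep for seconds_until_next_token() and try again. Because the refill happens inside each call, an idle bucket costs nothing between requests.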

Improvements

  • Exponential back-off on HTTP 429 — replaced the previous single fixed-delay retry (1 s, one attempt) with up to 3 retries using full-jitter exponential back-off (base 1 s, cap 30 s). The Retry-After response header is honoured when present, overriding the computed delay.
  • Streaming-batch purge — purgeAllPages no longer loads every page ID into memory before starting deletions. It now fetches one API result page, deletes those IDs concurrently, then re-fetches until the first page comes back empty. Memory usage is bounded to one result page (typically 15–20 IDs) regardless of project size.
  • Capped purge thread pool — the deleteAllPages executor is sized to min(batchSize, requestsPerSecond) instead of batchSize, preventing the creation of hundreds of idle threads when the batch size is large.
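The retry policy for HTTP 429 can be sketched as follows. This is an illustration of full-jitter exponential back-off with the parameters quoted above (base 1 s, cap 30 s, 3 retries), not the module's code. The function names, the send() callback returning a (status, headers) pair, and the handling of Retry-After only in its delta-seconds form are assumptions made for the example.

```python
import random
import time

BASE_DELAY_S = 1.0   # base delay from the changelog
MAX_DELAY_S = 30.0   # cap from the changelog
MAX_RETRIES = 3      # retries after the initial attempt


def backoff_delay(attempt, retry_after=None, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Retry-After overrides the computed delay when it is parseable.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    ceiling = min(MAX_DELAY_S, BASE_DELAY_S * (2 ** attempt))
    # Full jitter: pick uniformly between 0 and the exponential ceiling.
    return rng() * ceiling


def call_with_retry(send, sleep=time.sleep):
    """Call send() and retry on 429; returns the final HTTP status.

    send is a hypothetical callback returning (status_code, headers_dict).
    """
    for attempt in range(MAX_RETRIES + 1):
        status, headers = send()
        if status != 429 or attempt == MAX_RETRIES:
            return status
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
```

Full jitter spreads concurrent retries across the whole interval, so workers that were throttled together are less likely to retry together.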

Bug fixes

  • Fixed pagination drift during purge: the previous implementation followed next_page_url while deleting, which silently skipped items when offset-based pagination shifted after deletions. The new loop always re-queries the first page URL after each deletion round, guaranteeing no items are missed.
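The purge loop described in this changelog (fetch the first result page, delete its IDs concurrently, repeat until the first page is empty) might look like the sketch below. The callbacks fetch_first_page and delete_page are hypothetical stand-ins for the CustomGPT API calls, and the worker count follows the min(batchSize, requestsPerSecond) rule quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def purge_all_pages(fetch_first_page, delete_page,
                    batch_size=500, requests_per_second=10):
    """Delete every page and return how many were deleted.

    fetch_first_page() -> list of page IDs on the first result page (hypothetical)
    delete_page(page_id) -> None, raises on failure (hypothetical)
    """
    # More threads than the rate limit allows would only sit idle.
    workers = max(1, min(batch_size, requests_per_second))
    deleted = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # Always re-query the first page: following a "next" link would
            # skip items once earlier deletions shift the offsets.
            ids = fetch_first_page()
            if not ids:
                return deleted
            # Consuming the iterator waits for the round to finish and
            # re-raises the first deletion error, which stops the loop.
            list(pool.map(delete_page, ids))
            deleted += len(ids)
```

Only one result page of IDs is held in memory at a time. The sketch assumes a failed deletion raises; a deletion that fails silently would make the loop fetch the same IDs forever, so a real implementation needs a guard for that case.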

Upgrade notes

  • The new rateLimit.requestsPerSecond property defaults to 10. Existing deployments that relied on the previous fixed 500–1000 ms jitter between requests will now send up to 10 requests per second; adjust the property if the CustomGPT.ai plan has stricter limits.
  • No JCR schema changes. No migration script required.

Full Changelog: 1_0_3...1_0_4

FAQ

Configuration

The module uses the OSGi config PID org.jahia.community.modules.customgpt.
Place the file org.jahia.community.modules.customgpt.cfg in $JAHIA_HOME/digital-factory-data/karaf/etc/, or edit the settings from the Admin UI:

| Property | Default | Description |
|---|---|---|
| projectId | (empty) | CustomGPT project ID |
| token | (empty) | CustomGPT API Bearer token |
| apiBaseUrl | https://app.customgpt.ai/api/v1/ | CustomGPT API base URL |
| content.indexedMainResourceTypes | jnt:page,jmix:mainResource | Comma-separated main resource node types to index |
| content.indexedSubNodeTypes | jmix:droppableContent | Comma-separated sub-node types whose text content is included |
| content.indexedFileExtensions | pdf | Comma-separated file extensions to index |
| operations.batch.size | 500 | Batch size for concurrent deletions and indexing jobs |
| rateLimit.requestsPerSecond | 10 | Maximum API request rate, shared by indexing and purge calls; a module restart is required for changes to take effect |
| jahia.username | (empty) | Jahia user for rendering pages during indexing |
| jahia.password | (empty) | Jahia password for the rendering user |
| jahia.serverCookie.name/value/domain | (empty) | Optional server cookie injected during rendering |
| dryRun | true | When true, simulate indexing without calling CustomGPT |
| scheduleJobASAP | false | When true, schedule indexing jobs immediately; auto-resets to false after jobs are queued |

Admin UI

Navigate to Jahia Administration → CustomGPT.ai (/jahia/administration/customgptAiSettings).

The panel allows:

  • Editing all configuration properties
  • Viewing the CustomGPT project name (resolved live from the API)
  • Saving settings (writes the OSGi config file)
  • Purge All Pages — deletes every page registered in the CustomGPT project (irreversible, requires confirmation)

GraphQL API

All operations are exposed under the admin.customGpt namespace.

Queries

  • admin.customGpt.settings — read all settings (including projectName resolved from the API)
  • admin.customGpt.listSites — list indexed sites and their indexation status

Mutations

  • admin.customGpt.addSite(siteKey) — register a site for indexing (adds jmix:customGptIndexableSite mixin)
  • admin.customGpt.saveSettings(...) — persist settings to OSGi config
  • admin.customGpt.startIndex(siteKeys, force) — trigger full-site indexing (all sites if siteKeys omitted)
  • admin.customGpt.startNodeIndex(nodePaths, inclDescendants) — trigger indexing for specific nodes
  • admin.customGpt.purgeAllPages — delete all pages in the CustomGPT project; returns the number of pages deleted
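As an illustration of how the admin.customGpt namespace nests in a request, the sketch below builds (but does not send) a startIndex mutation. Several details are assumptions not confirmed by this page: the endpoint path /modules/graphql, the argument types (a list of strings and a boolean), and that startIndex returns a scalar and therefore needs no sub-selection. The helper names are hypothetical.

```python
import json
import urllib.request

GRAPHQL_PATH = "/modules/graphql"  # assumption: GraphQL endpoint of the Jahia server


def start_index_query(site_keys, force=False):
    """Build the mutation text; argument types are assumed, not documented here."""
    keys = ", ".join(json.dumps(k) for k in site_keys)
    return (
        "mutation { admin { customGpt { "
        f"startIndex(siteKeys: [{keys}], force: {str(force).lower()})"
        " } } }"
    )


def build_request(base_url, query, auth_header):
    """Wrap the query in a JSON POST request for the GraphQL endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + GRAPHQL_PATH,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,  # credentials of an administrator account
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it. The base URL should include the context path if Jahia is not deployed at the root, and the schema exposed by the running server remains the authority on field names and types.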

JCR Data Model

Each indexed node gets a customgptIndex child node (type jnt:customGptIndexEntry) storing the CustomGPT pageId as a string property. This replaces the legacy jmix:customGptIndexed mixin approach.

Migration from legacy mixins

Run scripts/cleanup-legacy-customgpt-mixins.groovy from the Jahia Groovy console to remove the old jmix:customGptIndexed / jmix:customGptFileIndexed mixins and the customGptPageId property from all nodes in both EDIT and LIVE workspaces.

How To Install

Installation

  • In Jahia, go to "Administration → Server settings → System components → Modules"
  • Upload the JAR customgpt-ai-X.X.X.jar
  • Check that the module is started

Configuration

  • Edit the file $JAHIA_HOME/digital-factory-data/karaf/etc/org.jahia.community.modules.customgpt.cfg

License

MIT License

Copyright (c) 2025 - present Florent BOURASSÉ

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.