mattermost/api/v4/source/reports.yaml
Scott Bishel b1338853a1
Add cursor-based Posts Reporting API for compliance and auditing (#34252)
* Add cursor-based Posts Reporting API for compliance and auditing

Implements a new admin-only endpoint for retrieving posts with efficient
cursor-based pagination, designed for compliance, auditing, and archival
workflows.

Key Features:
- Cursor-based pagination using composite (time, ID) keys for consistent
  performance regardless of dataset size (~10ms per page at any depth)
- Flexible time range queries with optional upper/lower bounds
- Support for both create_at and update_at time fields
- Ascending or descending sort order
- Optional metadata enrichment (files, reactions, acknowledgements)
- System admin only access (requires manage_system permission)
- License enforcement for compliance features

API Endpoint:
POST /api/v4/reports/posts
- Request: JSON body with channel_id, cursor_time, cursor_id, and options
- Response: Posts map + next_cursor object (null when pagination complete)
- Max page size: 1000 posts per request (MaxReportingPerPage constant)

Implementation:
- Store Layer: Direct SQL queries with composite index on (ChannelId, CreateAt, Id)
- App Layer: Permission checks, optional metadata enrichment, post hooks
- API Layer: Parameter validation, system admin enforcement, license checks
- Data Model: ReportPostOptions, ReportPostOptionsCursor, ReportPostListResponse

Code Quality Improvements:
- Added MaxReportingPerPage constant (1000) to eliminate magic numbers
- Removed unused StartTime field from ReportPostOptions
- Added fmt import for dynamic error messages

Testing:
- 14 comprehensive store layer unit tests
- 12 API layer integration tests covering permissions, pagination, filters
- All tests passing

Documentation:
- POSTS_REPORTING.md: Developer reference with Go structs and usage examples
- POSTS_REPORTING_API_SPEC.md: Complete technical specification
- GET_POSTS_API_IMPROVEMENTS.md: Implementation analysis and design rationale
- POSTS_TIME_RANGE_FEATURE.md: Archived time range feature for future use

Performance:
Cursor-based pagination maintains consistent ~10ms query time at any dataset
depth, compared to offset-based pagination which degrades significantly
(Page 1 = 10ms, Page 1000 = 10 seconds).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* lint fixes

* lint fixes

* gofmt

* i18n-extract

* Add Enterprise license requirement to posts reporting API

Enforce Enterprise license (tier 20+) for the new posts reporting endpoint
to align with compliance feature licensing. Professional tier is insufficient.

Changes:
- Add MinimumEnterpriseLicense check in GetPostsForReporting app layer
- Add test coverage for license validation (no license and Professional tier)

All existing tests pass with new license enforcement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* i18n-extract

* add licensing to api documentation

* Test SSH signing

* Add mmctl command for posts reporting API

Adds mmctl report posts command to retrieve posts from a channel for
administrative reporting purposes. Supports cursor-based pagination with
configurable sorting, filtering, and time range options.

Includes database migration for updateat+id index to support efficient
cursor-based queries when sorting by update_at.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Refactor posts reporting API cursor to opaque token and improve layer separation

This addresses code review feedback by transforming the cursor from exposed fields
to an opaque token and improving architectural layer separation.

**Key Changes:**

1. **Opaque Cursor Implementation**
   - Transform cursor from split fields (cursor_time, cursor_id) to single opaque base64-encoded string
   - Cursor now self-contained with all query parameters embedded
   - When cursor provided, embedded parameters take precedence over request body
   - Clients treat cursor as opaque token and pass unchanged

2. **Field Naming**
   - Rename ExcludeChannelMetadataSystemPosts → ExcludeSystemPosts
   - Now excludes ALL system posts (any type starting with "system_")
   - Clearer and more consistent naming

3. **Layer Separation**
   - Move cursor decoding from store layer to model layer
   - Create ReportPostQueryParams struct for resolved parameters
   - Store layer receives pre-resolved parameters (no business logic)
   - Add ResolveReportPostQueryParams() function in model layer

4. **Code Quality**
   - Add type-safe constants (ReportingTimeFieldCreateAt, ReportingSortDirectionAsc, etc.)
   - Replace magic number 9223372036854775807 with math.MaxInt64
   - Remove debug SQL logging (info disclosure risk)
   - Update mmctl to use constants and fix NextCursor pointer access

5. **Tests**
   - Update all 17 store test calls to use new resolution pattern
   - Add comprehensive test for DESC + end_time boundary behavior

6. **API Documentation**
   - Update OpenAPI spec to reflect opaque cursor format
   - Update all request/response examples
   - Clarify end_time behavior with sort directions

**Files Changed:**
- Model layer: public/model/post.go
- App layer: channels/app/report.go
- Store layer: channels/store/store.go, channels/store/sqlstore/post_store.go
- Tests: channels/store/storetest/post_store.go
- Mocks: channels/store/storetest/mocks/PostStore.go
- API: channels/api4/report.go, channels/api4/report_test.go
- mmctl: cmd/mmctl/commands/report.go
- Docs: api/v4/source/reports.yaml

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix unhandled parse errors in cursor decoding

Address security finding: cursor decoding was silently ignoring parse errors
from strconv functions, which could lead to unexpected behavior when malformed
cursors are provided.

Changes:
- Add explicit error handling for strconv.Atoi (version parsing)
- Add explicit error handling for strconv.ParseBool (includeDeleted, excludeSystemPosts)
- Add explicit error handling for strconv.ParseInt (timestamp parsing)
- Return clear error messages indicating which field failed to parse

This prevents silent failures where malformed values would default to zero-values
(0, false) and potentially alter query behavior without warning.

Addresses DryRun Security finding: "Unhandled Errors in Cursor Parsing"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix linting issues

- Remove unused reportPostCursorV1 struct (unused)
- Remove obsolete +build comment (buildtag)
- Use maps.Copy instead of manual loop (mapsloop)
- Modernize for loop with range over int (rangeint)
- Apply gofmt formatting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix gofmt formatting issues

Fix alignment in struct literals and constant declarations:
- Align map keys in report_test.go request bodies
- Align struct fields in ReportPostOptions initialization
- Align reporting constant declarations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update mmctl tests for opaque cursor and add i18n translations

Update report_test.go to align with the refactored Posts Reporting API:
- Replace split cursor flags (cursor-time, cursor-id) with single opaque cursor flag
- Update field name: ExcludeChannelMetadataSystemPosts → ExcludeSystemPosts
- Update all mock expectations to use new ReportPostOptionsCursor structure
- Replace test cursor values with base64-encoded opaque cursor strings

Add English translations for cursor decoding error messages in i18n/en.json.

Minor API documentation fix in reports.yaml (remove "all" from description).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update mmctl tests for opaque cursor and add i18n translations

Update report_test.go to align with the refactored Posts Reporting API:
- Replace split cursor flags (cursor-time, cursor-id) with single opaque cursor flag
- Update field name: ExcludeChannelMetadataSystemPosts → ExcludeSystemPosts
- Update all mock expectations to use new ReportPostOptionsCursor structure
- Replace test cursor values with base64-encoded opaque cursor strings

Add English translations for cursor decoding error messages in i18n/en.json.

Minor API documentation fix in reports.yaml (remove "all" from description).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* more lint fixes

* remove index update files

* Remove end_time parameter from Posts Reporting API

Align with other cursor-based APIs in the codebase by removing the end_time
parameter. The caller now controls when to stop pagination by simply not
making another request, which is the same pattern used by GetPostsSinceForSync,
MessageExport, and GetPostsBatchForIndexing.

Changes:
- Remove EndTime field from ReportPostOptions and ReportPostQueryParams
- Remove EndTime filtering logic from store layer
- Remove tests that used end_time parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Refactor posts reporting API for security and validation

Address security review feedback by consolidating parameter resolution
and validation in the API layer, with comprehensive validation of all
cursor fields to prevent SQL injection and invalid queries.

Changes:
- Move parameter resolution from model to API layer for clearer separation
- Add ReportPostQueryParams.Validate() with inline validation for all fields
- Validate ChannelId, TimeField, SortDirection, and CursorId format
- Add start_time parameter for time-bounded queries
- Cap per_page at 100-1000 instead of rejecting invalid values
- Export DecodeReportPostCursorV1() for API layer use
- Simplify app layer to receive pre-validated parameters
- Check channel existence when results are empty (better error messages)

Testing:
- Add 10 model tests for validation and malformed cursor scenarios
- Add 4 API tests for cursors with invalid field values
- Refactor 13 store tests to use buildReportPostQueryParams() helper
- All 31 tests pass

Documentation:
- Update OpenAPI spec with start_time, remove unused end_time
- Update markdown docs with start_time examples

Security improvements:
- Whitelist validation prevents SQL injection in TimeField/SortDirection
- Format validation ensures ChannelId and CursorId are valid IDs
- Single validation point for both cursor and options paths
- Defense in depth: validation + parameterized queries + store layer whitelist

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Improve posts reporting query efficiency and safety

Replace SELECT * and nested OR/AND conditions with explicit column
selection and PostgreSQL row value comparison for better performance
and maintainability.

Changes:
- Use postSliceColumns() instead of SELECT * for explicit column selection
- Replace Squirrel OR/AND with row value comparison: (timeField, Id) > (?, ?)
- Use fmt.Sprintf for safer string formatting in WHERE clause

Query improvements:
  Before: WHERE (CreateAt > ?) OR (CreateAt = ? AND Id > ?)
  After:  WHERE (CreateAt, Id) > (?, ?)

Benefits:
- Explicit column selection prevents issues if table schema changes
- Row value comparison is more concise and better optimized by PostgreSQL
- Follows existing patterns in post_store.go (postSliceColumns)
- Standard SQL:2003 syntax

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Change posts reporting response from map to ordered array

Replace the Posts map with an ordered array to preserve query sort order
and provide a more natural API response for sequential processing.

Changes:
- ReportPostListResponse.Posts: map[string]*Post → []*Post
- Store layer returns posts array directly (already sorted by query)
- App layer iterates by index for metadata enrichment
- Remove applyPostsWillBeConsumedHook call (not applicable to reporting)
- Update API tests to iterate arrays instead of map lookups
- Update store tests to convert array to map for deduplication checks
- Remove unused "maps" import

Benefits:
- Preserves query sort order (ASC/DESC, create_at/update_at)
- More natural for sequential processing/export workflows
- Simpler response structure for reporting/compliance use cases
- Aligns with message export/compliance patterns (no plugin hooks)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix linting issues in posts reporting tests

Replace inefficient loops with append(...) for better performance.

Changes:
- Use append(postSlice, result.Posts...) instead of loop
- Simplifies code and follows staticcheck recommendations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix store test AppError nil checking

Use require.Nil instead of require.NoError for *AppError returns
to avoid Go interface nil pointer issues.

When DecodeReportPostCursorV1 returns nil *AppError and it's assigned
to error interface, the interface becomes non-nil even though the
pointer is nil. This causes require.NoError to fail incorrectly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Mattermost Build <build@mattermost.com>
2025-11-17 09:02:19 -07:00

361 lines
14 KiB
YAML

/api/v4/reports/users:
get:
tags:
- reports
summary: Get a list of paged and sorted users for admin reporting purposes
description: >
Get a list of paged users for admin reporting purposes, based on provided parameters.
##### Permissions
Requires `sysconsole_read_user_management_users`.
operationId: GetUsersForReporting
parameters:
- name: sort_column
in: query
description: The column to sort the users by. Must be one of ("CreateAt", "Username", "FirstName", "LastName", "Nickname", "Email") or the API will return an error.
schema:
type: string
default: 'Username'
- name: direction
in: query
description: The direction to accept paging values from. Will return values ahead of the cursor if "prev", and below the cursor if "next". Default is "next".
schema:
type: string
default: 'next'
- name: sort_direction
in: query
description: The sorting direction. Must be one of ("asc", "desc"). Will default to 'asc' if not specified or the input is invalid.
schema:
type: string
default: 'asc'
- name: page_size
in: query
description: The maximum number of users to return.
schema:
type: integer
default: 50
minimum: 1
maximum: 100
- name: from_column_value
in: query
description: The value of the sorted column corresponding to the cursor to read from. Should be blank for the first page asked for.
schema:
type: string
- name: from_id
in: query
description: The value of the user id corresponding to the cursor to read from. Should be blank for the first page asked for.
schema:
type: string
- name: date_range
in: query
description: The date range of the post statistics to display. Must be one of ("last30days", "previousmonth", "last6months", "alltime"). Will default to 'alltime' if the input is not valid.
schema:
type: string
default: 'alltime'
- name: role_filter
in: query
description: Filter users by their role.
schema:
type: string
- name: team_filter
in: query
description: Filter users by a specified team ID.
schema:
type: string
- name: has_no_team
in: query
description: If true, show only users that have no team. Will ignore provided "team_filter" if true.
schema:
type: boolean
- name: hide_active
in: query
description: If true, show only users that are inactive. Cannot be used at the same time as "hide_inactive"
schema:
type: boolean
- name: hide_inactive
in: query
description: If true, show only users that are active. Cannot be used at the same time as "hide_active"
schema:
type: boolean
- name: search_term
in: query
description: A filtering search term that allows filtering by Username, FirstName, LastName, Nickname or Email
schema:
type: string
responses:
"200":
description: User page retrieval successful
content:
application/json:
schema:
type: array
items:
$ref: "#/components/schemas/UserReport"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"500":
$ref: "#/components/responses/InternalServerError"
/api/v4/reports/users/count:
get:
tags:
- reports
summary: Gets the full count of users that match the filter.
description: >
Get the full count of users admin reporting purposes, based on provided parameters.
##### Permissions
Requires `sysconsole_read_user_management_users`.
operationId: GetUserCountForReporting
parameters:
- name: role_filter
in: query
description: Filter users by their role.
schema:
type: string
- name: team_filter
in: query
description: Filter users by a specified team ID.
schema:
type: string
- name: has_no_team
in: query
description: If true, show only users that have no team. Will ignore provided "team_filter" if true.
schema:
type: boolean
- name: hide_active
in: query
description: If true, show only users that are inactive. Cannot be used at the same time as "hide_inactive"
schema:
type: boolean
- name: hide_inactive
in: query
description: If true, show only users that are active. Cannot be used at the same time as "hide_active"
schema:
type: boolean
- name: search_term
in: query
description: A filtering search term that allows filtering by Username, FirstName, LastName, Nickname or Email
schema:
type: string
responses:
"200":
description: User count retrieval successful
content:
application/json:
schema:
type: number
/api/v4/reports/users/export:
post:
tags:
- reports
summary: Starts a job to export the users to a report file.
description: >
Starts a job to export the users to a report file.
##### Permissions
Requires `sysconsole_read_user_management_users`.
operationId: StartBatchUsersExport
parameters:
- name: date_range
in: query
description: The date range of the post statistics to display. Must be one of ("last30days", "previousmonth", "last6months", "alltime"). Will default to 'alltime' if the input is not valid.
schema:
type: string
default: 'alltime'
responses:
"200":
description: Job successfully started
content:
application/json:
schema:
type: array
items:
$ref: "#/components/schemas/UserReport"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"500":
$ref: "#/components/responses/InternalServerError"
/api/v4/reports/posts:
post:
tags:
- reports
summary: Get posts for reporting and compliance purposes using cursor-based pagination
description: >
Get posts from a specific channel for reporting, compliance, and auditing purposes.
This endpoint uses cursor-based pagination to efficiently retrieve large datasets.
The cursor is an opaque, base64-encoded token that contains all pagination state.
Clients should treat the cursor as an opaque string and pass it back unchanged.
When a cursor is provided, query parameters from the initial request are embedded
in the cursor and take precedence over request body parameters.
##### Permissions
Requires `manage_system` permission (system admin only).
##### License
Requires an Enterprise license (or higher).
##### Features
- Cursor-based pagination for efficient large dataset retrieval
- Support for both create_at and update_at time fields
- Ascending or descending sort order
- Time range filtering with optional end_time
- Include/exclude deleted posts
- Exclude system posts (any type starting with "system_")
- Optional metadata enrichment (file info, reactions, emojis, priority, acknowledgements)
operationId: GetPostsForReporting
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- channel_id
properties:
channel_id:
type: string
description: The ID of the channel to retrieve posts from
cursor:
type: string
description: >
Opaque cursor string for pagination. Omit or use empty string for the first request.
For subsequent requests, use the exact cursor value from the previous response's next_cursor.
The cursor is base64-encoded and contains all pagination state including time, post ID,
and query parameters. Do not attempt to parse or modify the cursor value.
default: ""
start_time:
type: integer
format: int64
description: >
Optional start time for query range in Unix milliseconds. Only used for the first request (ignored when cursor is provided).
- For "asc" (ascending): starts retrieving from this time going forward
- For "desc" (descending): starts retrieving from this time going backward
If omitted, defaults to 0 for ascending or MaxInt64 for descending.
time_field:
type: string
enum: [create_at, update_at]
default: create_at
description: >
Which timestamp field to use for sorting and filtering.
Use "create_at" to retrieve posts by creation time, or "update_at" to
retrieve posts by last modification time.
sort_direction:
type: string
enum: [asc, desc]
default: asc
description: >
Sort direction for pagination. Use "asc" to retrieve posts from oldest
to newest, or "desc" to retrieve from newest to oldest.
per_page:
type: integer
minimum: 1
maximum: 1000
default: 100
description: Number of posts to return per page. Maximum 1000.
include_deleted:
type: boolean
default: false
description: >
If true, include posts that have been deleted (DeleteAt > 0).
By default, only non-deleted posts are returned.
exclude_system_posts:
type: boolean
default: false
description: >
If true, exclude all system posts.
include_metadata:
type: boolean
default: false
description: >
If true, enrich posts with additional metadata including file information,
reactions, custom emojis, priority, and acknowledgements. Note that this
may increase response time for large result sets.
examples:
first_page_ascending:
summary: First page, ascending order
value:
channel_id: "4xp9fdt77pncbef59f4k1qe83o"
cursor: ""
per_page: 100
subsequent_page:
summary: Subsequent page using cursor
value:
channel_id: "4xp9fdt77pncbef59f4k1qe83o"
cursor: "MTphYmMxMjM6Y3JlYXRlX2F0OmZhbHNlOmZhbHNlOmFzYzoxNjM1NzI0ODAwMDAwOnBvc3QxMjM"
per_page: 100
time_range_query:
summary: Query with time range starting from specific time
value:
channel_id: "4xp9fdt77pncbef59f4k1qe83o"
cursor: ""
start_time: 1635638400000
per_page: 100
descending_order:
summary: Descending order from recent
value:
channel_id: "4xp9fdt77pncbef59f4k1qe83o"
cursor: ""
sort_direction: "desc"
per_page: 100
responses:
"200":
description: Posts retrieved successfully
content:
application/json:
schema:
type: object
properties:
posts:
type: object
additionalProperties:
$ref: "#/components/schemas/Post"
description: Map of post IDs to post objects
next_cursor:
type: object
nullable: true
description: >
Opaque cursor for retrieving the next page. If null, there are no more pages.
Pass the cursor string from this object in the next request.
properties:
cursor:
type: string
description: Base64-encoded opaque cursor string containing pagination state
examples:
with_more_pages:
summary: Response with more pages available
value:
posts:
"post_id_1": { "id": "post_id_1", "message": "First post", "create_at": 1635638400000 }
"post_id_2": { "id": "post_id_2", "message": "Second post", "create_at": 1635638500000 }
next_cursor:
cursor: "MTphYmMxMjM6Y3JlYXRlX2F0OmZhbHNlOmZhbHNlOmFzYzoxNjM1NjM4NTAwMDAwOnBvc3RfaWRfMg"
last_page:
summary: Last page (no more results)
value:
posts:
"post_id_99": { "id": "post_id_99", "message": "Last post", "create_at": 1635724800000 }
next_cursor: null
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"500":
$ref: "#/components/responses/InternalServerError"