mattermost/server/public/model/post_reporting_test.go
Scott Bishel b1338853a1
Add cursor-based Posts Reporting API for compliance and auditing (#34252)
* Add cursor-based Posts Reporting API for compliance and auditing

Implements a new admin-only endpoint for retrieving posts with efficient
cursor-based pagination, designed for compliance, auditing, and archival
workflows.

Key Features:
- Cursor-based pagination using composite (time, ID) keys for consistent
  performance regardless of dataset size (~10ms per page at any depth)
- Flexible time range queries with optional upper/lower bounds
- Support for both create_at and update_at time fields
- Ascending or descending sort order
- Optional metadata enrichment (files, reactions, acknowledgements)
- System admin only access (requires manage_system permission)
- License enforcement for compliance features

API Endpoint:
POST /api/v4/reports/posts
- Request: JSON body with channel_id, cursor_time, cursor_id, and options
- Response: Posts map + next_cursor object (null when pagination complete)
- Max page size: 1000 posts per request (MaxReportingPerPage constant)

Implementation:
- Store Layer: Direct SQL queries with composite index on (ChannelId, CreateAt, Id)
- App Layer: Permission checks, optional metadata enrichment, post hooks
- API Layer: Parameter validation, system admin enforcement, license checks
- Data Model: ReportPostOptions, ReportPostOptionsCursor, ReportPostListResponse

Code Quality Improvements:
- Added MaxReportingPerPage constant (1000) to eliminate magic numbers
- Removed unused StartTime field from ReportPostOptions
- Added fmt import for dynamic error messages

Testing:
- 14 comprehensive store layer unit tests
- 12 API layer integration tests covering permissions, pagination, filters
- All tests passing

Documentation:
- POSTS_REPORTING.md: Developer reference with Go structs and usage examples
- POSTS_REPORTING_API_SPEC.md: Complete technical specification
- GET_POSTS_API_IMPROVEMENTS.md: Implementation analysis and design rationale
- POSTS_TIME_RANGE_FEATURE.md: Archived time range feature for future use

Performance:
Cursor-based pagination maintains consistent ~10ms query time at any dataset
depth, compared to offset-based pagination which degrades significantly
(Page 1 = 10ms, Page 1000 = 10 seconds).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* lint fixes

* lint fixes

* gofmt

* i18n-extract

* Add Enterprise license requirement to posts reporting API

Enforce Enterprise license (tier 20+) for the new posts reporting endpoint
to align with compliance feature licensing. Professional tier is insufficient.

Changes:
- Add MinimumEnterpriseLicense check in GetPostsForReporting app layer
- Add test coverage for license validation (no license and Professional tier)

All existing tests pass with new license enforcement.

* i18n-extract

* add licensing to api documentation

* Test SSH signing

* Add mmctl command for posts reporting API

Adds mmctl report posts command to retrieve posts from a channel for
administrative reporting purposes. Supports cursor-based pagination with
configurable sorting, filtering, and time range options.

Includes database migration for updateat+id index to support efficient
cursor-based queries when sorting by update_at.

* Refactor posts reporting API cursor to opaque token and improve layer separation

This addresses code review feedback by transforming the cursor from exposed fields
to an opaque token and improving architectural layer separation.

**Key Changes:**

1. **Opaque Cursor Implementation**
   - Transform cursor from split fields (cursor_time, cursor_id) to single opaque base64-encoded string
   - Cursor now self-contained with all query parameters embedded
   - When cursor provided, embedded parameters take precedence over request body
   - Clients treat cursor as opaque token and pass unchanged

2. **Field Naming**
   - Rename ExcludeChannelMetadataSystemPosts → ExcludeSystemPosts
   - Now excludes ALL system posts (any type starting with "system_")
   - Clearer and more consistent naming

3. **Layer Separation**
   - Move cursor decoding from store layer to model layer
   - Create ReportPostQueryParams struct for resolved parameters
   - Store layer receives pre-resolved parameters (no business logic)
   - Add ResolveReportPostQueryParams() function in model layer

4. **Code Quality**
   - Add type-safe constants (ReportingTimeFieldCreateAt, ReportingSortDirectionAsc, etc.)
   - Replace magic number 9223372036854775807 with math.MaxInt64
   - Remove debug SQL logging (info disclosure risk)
   - Update mmctl to use constants and fix NextCursor pointer access

5. **Tests**
   - Update all 17 store test calls to use new resolution pattern
   - Add comprehensive test for DESC + end_time boundary behavior

6. **API Documentation**
   - Update OpenAPI spec to reflect opaque cursor format
   - Update all request/response examples
   - Clarify end_time behavior with sort directions

**Files Changed:**
- Model layer: public/model/post.go
- App layer: channels/app/report.go
- Store layer: channels/store/store.go, channels/store/sqlstore/post_store.go
- Tests: channels/store/storetest/post_store.go
- Mocks: channels/store/storetest/mocks/PostStore.go
- API: channels/api4/report.go, channels/api4/report_test.go
- mmctl: cmd/mmctl/commands/report.go
- Docs: api/v4/source/reports.yaml

* Fix unhandled parse errors in cursor decoding

Address security finding: cursor decoding was silently ignoring parse errors
from strconv functions, which could lead to unexpected behavior when malformed
cursors are provided.

Changes:
- Add explicit error handling for strconv.Atoi (version parsing)
- Add explicit error handling for strconv.ParseBool (includeDeleted, excludeSystemPosts)
- Add explicit error handling for strconv.ParseInt (timestamp parsing)
- Return clear error messages indicating which field failed to parse

This prevents silent failures where malformed values would default to zero-values
(0, false) and potentially alter query behavior without warning.

Addresses DryRun Security finding: "Unhandled Errors in Cursor Parsing"

* Fix linting issues

- Remove unused reportPostCursorV1 struct (unused)
- Remove obsolete +build comment (buildtag)
- Use maps.Copy instead of manual loop (mapsloop)
- Modernize for loop with range over int (rangeint)
- Apply gofmt formatting

* Fix gofmt formatting issues

Fix alignment in struct literals and constant declarations:
- Align map keys in report_test.go request bodies
- Align struct fields in ReportPostOptions initialization
- Align reporting constant declarations

* Update mmctl tests for opaque cursor and add i18n translations

Update report_test.go to align with the refactored Posts Reporting API:
- Replace split cursor flags (cursor-time, cursor-id) with single opaque cursor flag
- Update field name: ExcludeChannelMetadataSystemPosts → ExcludeSystemPosts
- Update all mock expectations to use new ReportPostOptionsCursor structure
- Replace test cursor values with base64-encoded opaque cursor strings

Add English translations for cursor decoding error messages in i18n/en.json.

Minor API documentation fix in reports.yaml (remove "all" from description).

* more lint fixes

* remove index update files

* Remove end_time parameter from Posts Reporting API

Align with other cursor-based APIs in the codebase by removing the end_time
parameter. The caller now controls when to stop pagination by simply not
making another request, which is the same pattern used by GetPostsSinceForSync,
MessageExport, and GetPostsBatchForIndexing.

Changes:
- Remove EndTime field from ReportPostOptions and ReportPostQueryParams
- Remove EndTime filtering logic from store layer
- Remove tests that used end_time parameter

* Refactor posts reporting API for security and validation

Address security review feedback by consolidating parameter resolution
and validation in the API layer, with comprehensive validation of all
cursor fields to prevent SQL injection and invalid queries.

Changes:
- Move parameter resolution from model to API layer for clearer separation
- Add ReportPostQueryParams.Validate() with inline validation for all fields
- Validate ChannelId, TimeField, SortDirection, and CursorId format
- Add start_time parameter for time-bounded queries
- Cap per_page at 100-1000 instead of rejecting invalid values
- Export DecodeReportPostCursorV1() for API layer use
- Simplify app layer to receive pre-validated parameters
- Check channel existence when results are empty (better error messages)

Testing:
- Add 10 model tests for validation and malformed cursor scenarios
- Add 4 API tests for cursors with invalid field values
- Refactor 13 store tests to use buildReportPostQueryParams() helper
- All 31 tests pass

Documentation:
- Update OpenAPI spec with start_time, remove unused end_time
- Update markdown docs with start_time examples

Security improvements:
- Whitelist validation prevents SQL injection in TimeField/SortDirection
- Format validation ensures ChannelId and CursorId are valid IDs
- Single validation point for both cursor and options paths
- Defense in depth: validation + parameterized queries + store layer whitelist

* Improve posts reporting query efficiency and safety

Replace SELECT * and nested OR/AND conditions with explicit column
selection and PostgreSQL row value comparison for better performance
and maintainability.

Changes:
- Use postSliceColumns() instead of SELECT * for explicit column selection
- Replace Squirrel OR/AND with row value comparison: (timeField, Id) > (?, ?)
- Use fmt.Sprintf for safer string formatting in WHERE clause

Query improvements:
  Before: WHERE (CreateAt > ?) OR (CreateAt = ? AND Id > ?)
  After:  WHERE (CreateAt, Id) > (?, ?)

Benefits:
- Explicit column selection prevents issues if table schema changes
- Row value comparison is more concise and better optimized by PostgreSQL
- Follows existing patterns in post_store.go (postSliceColumns)
- Standard SQL:2003 syntax
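
A rough sketch of how such a keyset query might be assembled, assuming the time field and sort direction arrive as the strings `create_at`/`update_at` and `asc`/`desc` (the values that appear in the test cursors). `buildKeysetQuery` and `timeColumns` are hypothetical names, the real store builds its query differently, and the `?` placeholders follow the notation used in this commit message.

```go
package main

import "fmt"

// timeColumns whitelists the API-level time fields and maps them to column
// names, so user input is never interpolated into the SQL text directly.
var timeColumns = map[string]string{
	"create_at": "CreateAt",
	"update_at": "UpdateAt",
}

// buildKeysetQuery returns a query that resumes after the (time, id) cursor
// using a row value comparison. Only whitelisted identifiers reach Sprintf;
// the cursor values themselves stay as bound parameters.
func buildKeysetQuery(timeField, sortDirection string) (string, error) {
	column, ok := timeColumns[timeField]
	if !ok {
		return "", fmt.Errorf("time_field must be create_at or update_at, got %q", timeField)
	}

	var op, order string
	switch sortDirection {
	case "asc":
		op, order = ">", "ASC"
	case "desc":
		op, order = "<", "DESC"
	default:
		return "", fmt.Errorf("sort_direction must be asc or desc, got %q", sortDirection)
	}

	return fmt.Sprintf(
		"SELECT Id FROM Posts WHERE ChannelId = ? AND (%s, Id) %s (?, ?) ORDER BY %s %s, Id %s LIMIT ?",
		column, op, column, order, order,
	), nil
}

func main() {
	q, err := buildKeysetQuery("create_at", "asc")
	fmt.Println(q, err)

	// An injection attempt fails the whitelist and never reaches the query.
	_, err = buildKeysetQuery("'; DROP TABLE Posts--", "asc")
	fmt.Println(err != nil)
}
```

Descending order flips the comparison to `<` so that each page continues strictly past the last row of the previous one.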

* Change posts reporting response from map to ordered array

Replace the Posts map with an ordered array to preserve query sort order
and provide a more natural API response for sequential processing.

Changes:
- ReportPostListResponse.Posts: map[string]*Post → []*Post
- Store layer returns posts array directly (already sorted by query)
- App layer iterates by index for metadata enrichment
- Remove applyPostsWillBeConsumedHook call (not applicable to reporting)
- Update API tests to iterate arrays instead of map lookups
- Update store tests to convert array to map for deduplication checks
- Remove unused "maps" import

Benefits:
- Preserves query sort order (ASC/DESC, create_at/update_at)
- More natural for sequential processing/export workflows
- Simpler response structure for reporting/compliance use cases
- Aligns with message export/compliance patterns (no plugin hooks)

* Fix linting issues in posts reporting tests

Replace inefficient loops with append(...) for better performance.

Changes:
- Use append(postSlice, result.Posts...) instead of loop
- Simplifies code and follows staticcheck recommendations

* Fix store test AppError nil checking

Use require.Nil instead of require.NoError for *AppError returns
to avoid Go interface nil pointer issues.

When DecodeReportPostCursorV1 returns nil *AppError and it's assigned
to error interface, the interface becomes non-nil even though the
pointer is nil. This causes require.NoError to fail incorrectly.
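
The nil-pointer-in-interface behavior described above can be reproduced in isolation. The sketch below uses a hypothetical `appError` type and `decode` function as stand-ins for `*AppError` and `DecodeReportPostCursorV1`; it only demonstrates the language rule, not the Mattermost code.

```go
package main

import "fmt"

// appError is a hypothetical stand-in for a concrete error type such as *AppError.
type appError struct{ msg string }

func (e *appError) Error() string { return e.msg }

// decode mimics a function that returns a concrete *appError and succeeds.
func decode() *appError { return nil }

// isNilAsError assigns the nil pointer to an error interface and compares
// the interface to nil, which is what a helper taking an error parameter sees.
func isNilAsError() bool {
	var err error = decode() // interface now holds (type=*appError, value=nil)
	return err == nil
}

func main() {
	fmt.Println(decode() == nil) // true: the pointer itself is nil
	fmt.Println(isNilAsError())  // false: the interface carries a type, so it is non-nil
}
```

An assertion that accepts `interface{}` and inspects the underlying value, as `require.Nil` does, treats the nil pointer as nil, which is why it fits functions returning a concrete error pointer.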

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Mattermost Build <build@mattermost.com>
2025-11-17 09:02:19 -07:00

246 lines
7.2 KiB
Go

// Copyright (c) 2015-present Mattermost, Inc. All Rights Reserved.
// See LICENSE.txt for license information.

package model

import (
	"encoding/base64"
	"fmt"
	"testing"

	"github.com/stretchr/testify/require"
)

func TestReportPostQueryParamsValidate(t *testing.T) {
	tests := []struct {
		name      string
		params    ReportPostQueryParams
		wantError bool
		errorMsg  string
	}{
		{
			name: "valid params with create_at ASC",
			params: ReportPostQueryParams{
				ChannelId:          NewId(),
				CursorTime:         0,
				CursorId:           "",
				TimeField:          ReportingTimeFieldCreateAt,
				SortDirection:      ReportingSortDirectionAsc,
				IncludeDeleted:     false,
				ExcludeSystemPosts: false,
				PerPage:            100,
			},
			wantError: false,
		},
		{
			name: "valid params with update_at DESC and cursor",
			params: ReportPostQueryParams{
				ChannelId:          NewId(),
				CursorTime:         123456789,
				CursorId:           NewId(),
				TimeField:          ReportingTimeFieldUpdateAt,
				SortDirection:      ReportingSortDirectionDesc,
				IncludeDeleted:     true,
				ExcludeSystemPosts: true,
				PerPage:            1000,
			},
			wantError: false,
		},
		{
			name: "empty ChannelId",
			params: ReportPostQueryParams{
				ChannelId:     "",
				TimeField:     ReportingTimeFieldCreateAt,
				SortDirection: ReportingSortDirectionAsc,
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "channel_id must be a valid 26-character ID",
		},
		{
			name: "invalid ChannelId format",
			params: ReportPostQueryParams{
				ChannelId:     "invalid_id",
				TimeField:     ReportingTimeFieldCreateAt,
				SortDirection: ReportingSortDirectionAsc,
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "channel_id must be a valid 26-character ID",
		},
		{
			name: "invalid TimeField",
			params: ReportPostQueryParams{
				ChannelId:     NewId(),
				TimeField:     "invalid_field",
				SortDirection: ReportingSortDirectionAsc,
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "time_field must be",
		},
		{
			name: "SQL injection attempt in TimeField",
			params: ReportPostQueryParams{
				ChannelId:     NewId(),
				TimeField:     "'; DROP TABLE Posts--",
				SortDirection: ReportingSortDirectionAsc,
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "time_field must be",
		},
		{
			name: "invalid SortDirection",
			params: ReportPostQueryParams{
				ChannelId:     NewId(),
				TimeField:     ReportingTimeFieldCreateAt,
				SortDirection: "random",
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "sort_direction must be",
		},
		{
			name: "SQL injection attempt in SortDirection",
			params: ReportPostQueryParams{
				ChannelId:     NewId(),
				TimeField:     ReportingTimeFieldCreateAt,
				SortDirection: "ASC; DROP TABLE Posts--",
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "sort_direction must be",
		},
		{
			name: "invalid CursorId format",
			params: ReportPostQueryParams{
				ChannelId:     NewId(),
				CursorId:      "invalid_cursor_id",
				TimeField:     ReportingTimeFieldCreateAt,
				SortDirection: ReportingSortDirectionAsc,
				PerPage:       100,
			},
			wantError: true,
			errorMsg:  "cursor_id must be a valid 26-character ID",
		},
		{
			name: "empty CursorId is valid (first page)",
			params: ReportPostQueryParams{
				ChannelId:     NewId(),
				CursorId:      "",
				TimeField:     ReportingTimeFieldCreateAt,
				SortDirection: ReportingSortDirectionAsc,
				PerPage:       100,
			},
			wantError: false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			err := tt.params.Validate()
			if tt.wantError {
				require.NotNil(t, err, "expected error but got nil")
				require.Contains(t, err.Error(), tt.errorMsg)
			} else {
				require.Nil(t, err, "expected nil error but got: %v", err)
			}
		})
	}
}

func TestDecodeReportPostCursorV1(t *testing.T) {
	validChannelId := NewId()
	validPostId := NewId()

	tests := []struct {
		name      string
		cursor    string
		wantError bool
		errorMsg  string
		validate  func(*testing.T, *ReportPostQueryParams)
	}{
		{
			name:      "valid cursor with create_at ASC",
			cursor:    EncodeReportPostCursor(validChannelId, ReportingTimeFieldCreateAt, false, false, ReportingSortDirectionAsc, 1640000000000, validPostId),
			wantError: false,
			validate: func(t *testing.T, params *ReportPostQueryParams) {
				require.Equal(t, validChannelId, params.ChannelId)
				require.Equal(t, ReportingTimeFieldCreateAt, params.TimeField)
				require.Equal(t, ReportingSortDirectionAsc, params.SortDirection)
				require.Equal(t, int64(1640000000000), params.CursorTime)
				require.Equal(t, validPostId, params.CursorId)
				require.False(t, params.IncludeDeleted)
				require.False(t, params.ExcludeSystemPosts)
			},
		},
		{
			name:      "valid cursor with update_at DESC",
			cursor:    EncodeReportPostCursor(validChannelId, ReportingTimeFieldUpdateAt, true, true, ReportingSortDirectionDesc, 1650000000000, validPostId),
			wantError: false,
			validate: func(t *testing.T, params *ReportPostQueryParams) {
				require.Equal(t, validChannelId, params.ChannelId)
				require.Equal(t, ReportingTimeFieldUpdateAt, params.TimeField)
				require.Equal(t, ReportingSortDirectionDesc, params.SortDirection)
				require.Equal(t, int64(1650000000000), params.CursorTime)
				require.Equal(t, validPostId, params.CursorId)
				require.True(t, params.IncludeDeleted)
				require.True(t, params.ExcludeSystemPosts)
			},
		},
		{
			name:      "invalid base64",
			cursor:    "not-valid-base64!!!",
			wantError: true,
			errorMsg:  "invalid_base64",
		},
		{
			name:      "invalid format - too few parts",
			cursor:    base64.URLEncoding.EncodeToString([]byte("1:abc:create_at:false:false:asc:123")),
			wantError: true,
			errorMsg:  "invalid_format",
		},
		{
			name:      "invalid version",
			cursor:    base64.URLEncoding.EncodeToString([]byte("2:abc123xyz789012345678901234:create_at:false:false:asc:1640000000000:post123")),
			wantError: true,
			errorMsg:  "unsupported_version",
		},
		{
			name:      "invalid boolean for include_deleted",
			cursor:    base64.URLEncoding.EncodeToString(fmt.Appendf(nil, "1:%s:create_at:not_bool:false:asc:1640000000000:%s", validChannelId, validPostId)),
			wantError: true,
			errorMsg:  "invalid_include_deleted",
		},
		{
			name:      "invalid boolean for exclude_system_posts",
			cursor:    base64.URLEncoding.EncodeToString(fmt.Appendf(nil, "1:%s:create_at:false:not_bool:asc:1640000000000:%s", validChannelId, validPostId)),
			wantError: true,
			errorMsg:  "invalid_exclude_system_posts",
		},
		{
			name:      "invalid timestamp",
			cursor:    base64.URLEncoding.EncodeToString(fmt.Appendf(nil, "1:%s:create_at:false:false:asc:not_a_number:%s", validChannelId, validPostId)),
			wantError: true,
			errorMsg:  "invalid_timestamp",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			params, err := DecodeReportPostCursorV1(tt.cursor)
			if tt.wantError {
				require.NotNil(t, err, "expected error but got nil")
				require.Contains(t, err.Error(), tt.errorMsg)
				require.Nil(t, params)
			} else {
				require.Nil(t, err, "expected nil error but got: %v", err)
				require.NotNil(t, params)
				if tt.validate != nil {
					tt.validate(t, params)
				}
			}
		})
	}
}