v0.0.1
216
CLAUDE.md
@@ -4,197 +4,85 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

## Project Overview

This is a SAP C4C (Cloud for Customer) attachment downloader toolkit that retrieves attachments from ServiceRequest tickets and optionally uploads them to Synology DSM NAS. The project consists of:
SAP C4C (Cloud for Customer) attachment downloader toolkit that retrieves attachments from ServiceRequest tickets and optionally uploads them to Synology DSM NAS.

- **Python script** (`sap-c4c-AttachmentFolder.py`): Core downloader using OData APIs and web scraping
- **Java wrapper** (`C4CAttachmentDownloader.java`): Java interface that calls the Python script via ProcessBuilder
- **DSM upload script** (`dsm-upload.py`): Standalone Synology NAS upload utility

## Architecture

### Python Script (`sap-c4c-AttachmentFolder.py`)

**Core functionality:**
1. Authenticates to SAP C4C using Basic Auth
2. Fetches ServiceRequest attachments via OData endpoints:
   - `/sap/c4c/odata/v1/c4codata` - Standard C4C OData API
   - `/sap/c4c/odata/cust/v1/custticketapi` - Custom ticket API
3. Downloads two types of attachments using **multi-threaded concurrent downloads**:
   - **File attachments** (CategoryCode=2): Downloaded via OData `$value` endpoint
   - **Link attachments** (CategoryCode=3): External Salesforce links scraped using Scrapling + Playwright
4. Handles XIssueItem-level attachments via `BO_XSRIssueItemAttachmentFolder`
5. Optionally uploads downloaded files to Synology DSM via FileStation API

**Key dependencies:**
- `requests` - HTTP client for OData/REST APIs
- `scrapling[all]` - Web scraping framework with stealth capabilities
- `playwright` - Browser automation for downloading Salesforce attachments

**Performance features:**
- Multi-threaded concurrent downloads (default: 5 threads, configurable via `--max-workers`)
- Thread-safe output logging with lock mechanism
- Parallel processing of both file and link attachments

**Output modes:**
- Human-readable console output (default)
- JSON mode (`--json`) for programmatic consumption

### Java Wrapper (`C4CAttachmentDownloader.java`)

Provides a type-safe Java API that:
- Invokes the Python script via `ProcessBuilder`
- Passes credentials via environment variables (more secure than CLI args)
- Parses JSON output into strongly-typed Java objects
- Supports timeout configuration (default: 30 minutes)
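
On the Python side, credentials can arrive either as CLI flags or as the `C4C_*` environment variables the wrapper sets. The following is a minimal sketch of how that resolution might look, not the script's actual code: the helper name `parse_credentials` and the rule that CLI flags win over environment variables are assumptions made for illustration.

```python
import argparse
import os


def parse_credentials(argv=None, env=None):
    """Resolve tenant/user/password from CLI flags, falling back to env vars.

    Hypothetical helper: the flag and variable names come from this file,
    but the precedence (CLI over environment) is an assumption.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="credential resolution sketch")
    parser.add_argument("--tenant", default=env.get("C4C_TENANT"))
    parser.add_argument("--user", default=env.get("C4C_USERNAME"))
    parser.add_argument("--password", default=env.get("C4C_PASSWORD"))
    args = parser.parse_args(argv)

    # Fail early if a value is present in neither source.
    missing = [name for name in ("tenant", "user", "password") if not getattr(args, name)]
    if missing:
        parser.error("missing credentials: " + ", ".join(missing))
    return args
```

Reading the environment keeps secrets out of the process list, which is the reason given above for the wrapper's choice.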

**Key classes:**
- `Result` - Top-level response containing all attachment metadata
- `Attachment` - Individual attachment metadata (UUID, filename, MIME type, category)
- `IssueItem` - XIssueItem with nested attachments
- `DownloadedFile` - Download result with local path and error info
- `DsmUploadEntry` - DSM upload result per file

### DSM Upload (`dsm-upload.py`)

Standalone script demonstrating Synology FileStation API usage:
1. Login via `SYNO.API.Auth` to obtain SID
2. Upload files via `SYNO.FileStation.Upload` with SID cookie
- **`sap-c4c-AttachmentFolder.py`**: Core downloader (Python >= 3.8) using OData APIs and web scraping
- **`C4CAttachmentDownloader.java`**: Java wrapper that calls the Python script via ProcessBuilder
- **`dsm-upload.py`**: Standalone Synology NAS upload example

## Common Commands

### Python Script

```bash
# Install dependencies (the extra is quoted so shells such as zsh do not glob the brackets)
pip install requests "scrapling[all]" playwright
python -m playwright install chromium

# Download attachments (credentials via CLI)
# Download attachments
python sap-c4c-AttachmentFolder.py \
  --tenant https://xxx.c4c.saphybriscloud.cn \
  --user admin \
  --password xxx \
  --ticket 24588
  --user admin --password xxx --ticket 24588

# Download with custom thread count (default: 5)
python sap-c4c-AttachmentFolder.py \
  --tenant https://xxx.c4c.saphybriscloud.cn \
  --user admin \
  --password xxx \
  --ticket 24588 \
  --max-workers 10

# Download with DSM upload
python sap-c4c-AttachmentFolder.py \
  --tenant https://xxx.c4c.saphybriscloud.cn \
  --user admin \
  --password xxx \
  --ticket 24588 \
  --dsm-url http://10.0.10.235:5000 \
  --dsm-user PLM \
  --dsm-password 123456 \
  --dsm-path /Newgonow/AU-SPFJ

# JSON mode (for Java/programmatic use)
python sap-c4c-AttachmentFolder.py --ticket 24588 --json
# Download with custom concurrency (default: 5 threads)
python sap-c4c-AttachmentFolder.py --ticket 24588 --max-workers 10

# List attachments only (no download)
python sap-c4c-AttachmentFolder.py --ticket 24588 --list-only

# Using environment variables for credentials
export C4C_TENANT=https://xxx.c4c.saphybriscloud.cn
export C4C_USERNAME=admin
export C4C_PASSWORD=xxx
export DSM_URL=http://10.0.10.235:5000
export DSM_USERNAME=PLM
export DSM_PASSWORD=123456
export DSM_PATH=/Newgonow/AU-SPFJ
# JSON mode (for Java/programmatic use)
python sap-c4c-AttachmentFolder.py --ticket 24588 --json

# Download + upload to Synology DSM
python sap-c4c-AttachmentFolder.py --ticket 24588 \
  --dsm-url http://10.0.10.235:5000 --dsm-user PLM \
  --dsm-password 123456 --dsm-path /Newgonow/AU-SPFJ

# All credentials also accept environment variables:
# C4C_TENANT, C4C_USERNAME, C4C_PASSWORD, DSM_URL, DSM_USERNAME, DSM_PASSWORD, DSM_PATH
```

```bash
# Java: compiling requires Jackson (jackson-databind, jackson-core, jackson-annotations)
javac -cp jackson-databind.jar:jackson-core.jar:jackson-annotations.jar C4CAttachmentDownloader.java
```

## Architecture

### Data Flow

1. Authenticate to SAP C4C via Basic Auth
2. Look up ServiceRequest by ticket ID -> get ObjectID and SerialID
3. Fetch SR-level attachments via `/sap/c4c/odata/v1/c4codata/ServiceRequestCollection('{OID}')/ServiceRequestAttachmentFolder`
4. Fetch XIssueItem-level attachments via `/sap/c4c/odata/cust/v1/custticketapi/BO_XSRIssueItemAttachmentCollection` (two-step: filter by UUID, then navigate to AttachmentFolder)
5. Download concurrently using ThreadPoolExecutor:
   - **CategoryCode "2"** (file): OData `$value` endpoint or `DocumentLink` URL
   - **CategoryCode "3"** (link): Scrapling + Playwright opens Salesforce URL, clicks `button.downloadbutton[title='Download']`, captures download
6. Optionally upload to Synology DSM via FileStation API, then **auto-delete local files**
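
Step 6 places each file in a per-ticket directory on the NAS. The sketch below shows one way those remote paths could be assembled, assuming the layout `{DSM_PATH}/{ticketID}_{serialID}/{issueID}/{filename}` documented under "DSM Upload Directory Structure". The function name `dsm_remote_path` and the sample serial ID are hypothetical and do not come from the script.

```python
import posixpath


def dsm_remote_path(dsm_path, ticket_id, serial_id, filename, issue_id=None):
    """Build the remote DSM path for one attachment (hypothetical helper).

    SR-level files go directly into `{ticket}_{serial}`; IssueItem files get
    an extra `{issue_id}` directory level.
    """
    parts = [dsm_path, f"{ticket_id}_{serial_id}"]
    if issue_id is not None:
        parts.append(str(issue_id))
    # Strip any directory components so a filename cannot escape the folder.
    parts.append(posixpath.basename(filename))
    return posixpath.join(*parts)


# Example with made-up IDs:
print(dsm_remote_path("/Newgonow/AU-SPFJ", "24588", "SR001", "photo.jpg"))
print(dsm_remote_path("/Newgonow/AU-SPFJ", "24588", "SR001", "photo.jpg", issue_id="10"))
```

`posixpath` is used rather than `os.path` so the separator stays `/` even when the script runs on Windows.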

### Two OData Endpoints

- `/sap/c4c/odata/v1/c4codata` (`ODATA_C4C`) - Standard C4C OData for ServiceRequest and SR-level attachments
- `/sap/c4c/odata/cust/v1/custticketapi` (`ODATA_CUST`) - Custom ticket API for XIssueItem and its attachments

### Java Wrapper

```java
// Compile (requires Jackson for JSON parsing)
javac -cp jackson-databind.jar:jackson-core.jar:jackson-annotations.jar C4CAttachmentDownloader.java
Invokes Python script with `--json` flag, passes credentials via **environment variables** (not CLI args for security). Parses JSON into typed classes: `Result`, `Attachment`, `IssueItem`, `DownloadedFile`, `DsmUploadEntry`. Default timeout: 30 minutes.

// Basic usage
C4CAttachmentDownloader downloader = new C4CAttachmentDownloader(
    "/path/to/sap-c4c-AttachmentFolder.py",
    "https://xxx.c4c.saphybriscloud.cn",
    "admin",
    "password"
);
### DSM Upload Directory Structure

// List attachments only
C4CAttachmentDownloader.Result result = downloader.listAttachments("24588");
- SR attachments: `{DSM_PATH}/{ticketID}_{serialID}/{filename}`
- IssueItem attachments: `{DSM_PATH}/{ticketID}_{serialID}/{issueID}/{filename}`

// Download to default directory
C4CAttachmentDownloader.Result result = downloader.download("24588");
### Concurrency Model

// Download to specific directory
C4CAttachmentDownloader.Result result = downloader.download("24588", "/tmp/ticket_24588");
Multi-threaded via `ThreadPoolExecutor` (default 5 workers, configurable with `--max-workers`). Both file and link downloads are submitted as futures. Thread-safe console output uses a `print_lock`. The `requests.Session` is shared across file-download threads; this works in practice for simple requests, although `requests` does not formally guarantee that `Session` is thread-safe. Scrapling/Playwright link downloads each launch their own browser.

// Download with DSM upload
downloader.setDsmConfig("http://10.0.10.235:5000", "PLM", "123456", "/Newgonow/AU-SPFJ");
C4CAttachmentDownloader.Result result = downloader.download("24588", "/tmp/ticket_24588");
```
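
The concurrency model described in this file (a `ThreadPoolExecutor` with a `print_lock` for console output) can be sketched roughly as follows. This is an illustration under assumptions rather than the script's code: `download_all`, `safe_print`, and the shape of the result dicts are hypothetical names chosen for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

print_lock = threading.Lock()


def safe_print(message):
    """Serialize console output so lines from different threads do not interleave."""
    with print_lock:
        print(message)


def download_all(items, download_one, max_workers=5):
    """Run download_one(item) for every item on a thread pool.

    Returns one dict per item with either a local path or an error message,
    in completion order (not submission order).
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results.append({"item": item, "path": future.result(), "error": None})
            except Exception as exc:  # one failed download must not abort the rest
                results.append({"item": item, "path": None, "error": str(exc)})
            safe_print(f"finished {item}")
    return results
```

Collecting errors per item instead of re-raising keeps a single broken attachment from cancelling the remaining downloads, which matches the per-file error info the wrapper exposes through `DownloadedFile`.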
### Global State

## Key Implementation Details

### Attachment Categories

SAP C4C uses `CategoryCode` to distinguish attachment types:
- **"2"** = File attachment (binary content stored in C4C, downloaded via OData `$value`)
- **"3"** = Link attachment (external URL, typically Salesforce links requiring web scraping)
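
A minimal sketch of routing attachments by `CategoryCode`, assuming each attachment is a dict decoded from the OData JSON response. Only the `CategoryCode` field and its two values come from this file; the helper name `split_by_category` and the `Name` key in the example are hypothetical.

```python
FILE_CATEGORY = "2"  # binary content stored in C4C
LINK_CATEGORY = "3"  # external URL, e.g. a Salesforce link


def split_by_category(attachments):
    """Partition attachment dicts into (files, links, other) by CategoryCode."""
    files, links, other = [], [], []
    for attachment in attachments:
        # Normalize to a stripped string so 2 and "2" are treated alike.
        code = str(attachment.get("CategoryCode", "")).strip()
        if code == FILE_CATEGORY:
            files.append(attachment)
        elif code == LINK_CATEGORY:
            links.append(attachment)
        else:
            other.append(attachment)  # unknown category: keep for reporting, do not download
    return files, links, other


files, links, other = split_by_category([
    {"Name": "report.pdf", "CategoryCode": "2"},
    {"Name": "sf-link", "CategoryCode": "3"},
])
```

Keeping an `other` bucket makes unexpected category codes visible instead of silently dropping them.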

### OData Navigation Paths

**ServiceRequest attachments:**
```
/ServiceRequestCollection('{ObjectID}')/ServiceRequestAttachmentFolder
```

**XIssueItem attachments (two-step navigation):**
```
1. /BO_XSRIssueItemAttachmentCollection?$filter=XIssueItemUUID eq guid'{uuid}'
2. /BO_XSRIssueItemAttachmentCollection('{ObjectID}')/BO_XSRIssueItemAttachmentFolder
```
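
The navigation paths above can be turned into full URLs along these lines. This is a sketch under assumptions: the two base paths and entity names are taken from this file, while the three function names and the choice to percent-encode the `$filter` value with `urllib.parse.quote` are illustrative and may differ from what the script does.

```python
from urllib.parse import quote

ODATA_C4C = "/sap/c4c/odata/v1/c4codata"
ODATA_CUST = "/sap/c4c/odata/cust/v1/custticketapi"


def sr_attachments_url(tenant, object_id):
    """URL of the attachment folder for one ServiceRequest."""
    base = tenant.rstrip("/") + ODATA_C4C
    return f"{base}/ServiceRequestCollection('{object_id}')/ServiceRequestAttachmentFolder"


def issue_item_lookup_url(tenant, item_uuid):
    """Step 1: find the attachment entity for an XIssueItem by its UUID."""
    base = tenant.rstrip("/") + ODATA_CUST
    # Encode spaces but keep the single quotes that OData guid literals need.
    flt = quote(f"XIssueItemUUID eq guid'{item_uuid}'", safe="'")
    return f"{base}/BO_XSRIssueItemAttachmentCollection?$filter={flt}"


def issue_item_folder_url(tenant, object_id):
    """Step 2: navigate from that entity to its attachment folder."""
    base = tenant.rstrip("/") + ODATA_CUST
    return (f"{base}/BO_XSRIssueItemAttachmentCollection('{object_id}')"
            "/BO_XSRIssueItemAttachmentFolder")
```

The `ObjectID` passed to `issue_item_folder_url` is the one returned by the step 1 query, which is why the lookup takes two requests.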

### Scrapling Download Strategy

For CategoryCode=3 (link attachments), the script:
1. Opens the Salesforce link in a headless Chromium browser
2. Waits for `button.downloadbutton[title='Download']` selector
3. Clicks the button and captures the download
4. Saves with original or suggested filename

### Security Considerations

- Java wrapper passes credentials via **environment variables** (not CLI args) to avoid exposure in process lists
- Python script supports both CLI args and environment variables
- DSM API uses session-based authentication (SID cookie)
- SSL verification disabled (`verify=False`) - consider enabling in production

## File Structure

```
.
├── C4CAttachmentDownloader.java   # Java wrapper with typed API
├── sap-c4c-AttachmentFolder.py    # Core Python downloader
├── dsm-upload.py                  # Standalone DSM upload example
└── downloads/                     # Default output directory
```
The Python script uses module-level globals (`TENANT`, `USERNAME`, `PASSWORD`, `ODATA_C4C`, `ODATA_CUST`, `OUTPUT_DIR`, `DSM_*`, `MAX_WORKERS`) initialized in `main()`. The `run()` function is the core entry point returning a structured dict.

## Troubleshooting

**Playwright not installed:**
```bash
python -m playwright install chromium
```

**Timeout errors:** Increase timeout in Java wrapper constructor (default 30 minutes) or adjust Scrapling timeout parameters.

**DSM upload fails:** Verify DSM URL, credentials, and that target path exists or `create_parents=true` is set.

**Link download fails:** Check that Salesforce page structure matches expected selector (`button.downloadbutton[title='Download']`). Update `download_link_via_scrapling()` if page structure changes.
- **Playwright not installed**: `python -m playwright install chromium`
- **Link download fails**: Salesforce page selector `button.downloadbutton[title='Download']` may have changed; update `download_link_via_scrapling()`
- **Timeout**: Increase Java wrapper timeout or Scrapling's `timeout` param (currently 60s page load, 120s download wait)
- **SSL warnings**: `verify=False` is used throughout; `urllib3` warnings are suppressed