-
Thank you so much for the comprehensive code review and for being on standby during this critical period. Your timing couldn't have been better—I really appreciate the detailed analysis.
Current Status
I did test the script on a dev environment (pulled from production backup), and it successfully reduced the record count as expected. However, I ran into display issues on the dev site due to symlinked storage that I forgot to account for when setting up the test environment. This was compounded by production hitting critical disk space (90%+ full), which pulled my attention away from the test script that I'd already started.
The good news: Production is now stable (55% disk usage) after moving media files to remote NFS storage and fixing some symlink/permission issues. The bad news: I haven't yet run the cleanup script on production due to the symlink complications discovered during testing.
Addressing Your Review Points
You're absolutely right about the high-risk items:
- The
type='post'filter - I need to run your distribution query first. I was making assumptions about the schema without verifying. - Using
bx_srv('bx_timeline', 'delete', [$iId])- This is exactly the kind of version-stable approach I should be using instead of calling internal methods directly. - Web accessibility - The CLI-only check is a must. I hadn't considered this could be triggered accidentally via web.
Next Steps Before Production Run
Before I run this on production, I'll:
- Run the distribution query to verify actual type/action combinations:
sql
SELECT `type`, `action`, COUNT(*) c FROM `bx_timeline_events` GROUP BY `type`, `action` ORDER BY c DESC;- Add the exclusion filters for system/pinned/promoted events (after verifying column names exist in my schema)
- Switch to the documented service call (
bx_srv) instead of direct method calls - Implement CLI-only restriction and improve dry-run logging clarity
- Start with a very small batch (10-20 records) in dry-run mode to verify targeting
Question for You
My production environment is UNACMS 14 (I'll confirm exact version and post it). Once I run the distribution query, would you mind helping me craft the precise WHERE clause for my specific setup?
Also, regarding the symlinked storage issue I encountered in testing—when Timeline events are deleted via
bx_srv('bx_timeline', 'delete', [$iId]), should it properly clean up associated media files even when they're symlinked to remote NFS storage? Or do I need additional cleanup steps?Appreciation
Your tightened-up "safe version" is exactly what I needed. The improvements around:
- Logging type/action/owner for better audit trails
- Excluding system events
- Using documented APIs
- Misleading dry-run wording
...are all spot-on and will make this much more maintainable going forward.
Thanks again for the thorough review and for standing by. I'll post the distribution query results shortly.
Regards Chris
-
Understood on UNA CMS 14.x. Yes—once you paste the (type, action) distribution output, I can help you narrow it to a precise, low-risk
WHEREclause that matches your instance’s real event taxonomy (and avoids collateral deletes).Below is the exact workflow I recommend so the
WHEREclause is grounded in your data, not assumptions.1) Crafting the precise
WHEREclause from your distribution queryStep A — Confirm what you’re actually deleting (taxonomy + sample rows)
Run the distribution query you mentioned:
SELECT `type`, `action`, COUNT(*) AS `cnt` FROM `bx_timeline_events` GROUP BY `type`, `action` ORDER BY `cnt` DESC;Then, for the top 5–10 combos (and any combo you think are user “posts”), pull a small sample to validate they are the intended target:
SELECT `id`, `type`, `action`, `owner_id`, `object_id`, `content`, `date` FROM `bx_timeline_events` WHERE `type` = 'X' AND `action` = 'Y' ORDER BY `date` DESC LIMIT 20;This is what prevents the classic failure mode: deleting things that look like “posts” but are actually system-generated events, module activity, profile changes, etc.
Step B — Build a “candidate delete set” filter, then tighten it
In almost every UNA Timeline cleanup, the final
WHEREends up being:- Age cutoff
- One or more specific (type, action) pairs
- Explicit exclusions for system/admin/sticky/promoted/special events (based on what columns/flags exist in your schema)
A safe template (to be adapted to your actual columns/taxonomy) looks like:
WHERE `date` < {CUTOFF_UNIX_TS} AND ( (`type` = '...' AND `action` = '...') OR (`type` = '...' AND `action` = '...') OR (`type` = '...' AND `action` = '...') ) AND `owner_id` > 0 -- Optional, only if these columns exist in your install: -- AND `system` = 0 -- AND `sticked` = 0 -- AND `promoted` = 0 -- AND `pinned` = 0 ;Step C — Prove the
WHEREclause before deleting anythingBefore you run the deleter, validate with:
SELECT COUNT(*) AS `to_delete` FROM `bx_timeline_events` <WHERE_CLAUSE>;And then eyeball a stratified sample:
SELECT `id`, `type`, `action`, `owner_id`, `date` FROM `bx_timeline_events` <WHERE_CLAUSE> ORDER BY `date` DESC LIMIT 200;If you paste your distribution output, I’ll translate it into the minimal, correct
WHEREfor your environment (including exclusions that match what’s actually present in your schema).2) Symlinked/NFS storage: will
bx_srv('bx_timeline','delete',[$iId])remove media?What should happen
In UNA, deleting via the module service (
bx_srv('bx_timeline','delete',[$eventId])) is the right call because it is expected to:- delete the timeline event row
- delete any module-specific media references
- call into the storage layer to remove stored files (and their DB records)
So, in principle, the service delete is the mechanism that should properly clean up associated media—regardless of whether the underlying storage path is “local” or happens to be a symlink pointing at NFS.
Why symlink/NFS can still leave leftovers
Even with the correct service call, you can still end up with orphaned files if any of the following are true:
- Permission/ownership mismatch on the NFS mount (PHP user can delete DB rows but cannot unlink the remote file).
- realpath/symlink path validation inside the storage engine behaves unexpectedly (less common, but possible in hardened filesystem checks).
- NFS latency / transient I/O errors during unlink cause the physical delete to fail while the DB transaction succeeds (or vice-versa).
- Your timeline attachments are stored in a different storage object than you think (e.g., separate storage objects for photos/videos/files), and some objects are not being invoked by that specific deletion path (version/module variations).
The practical answer: rely on service delete, but add verification + “orphan sweep”
I recommend treating this as two-phase:
- Primary deletion via
bx_srv('bx_timeline','delete', ...)(what you are already doing) - Post-run verification that catches orphaned storage records/files, and optionally deletes them via the storage API (not manual
unlink())
Verification approach (DB-level)
After deleting a small test batch, check whether storage DB rows tied to timeline are decreasing as expected.
Because table/object names can vary by install, the most robust approach is:
- Identify the storage object(s) used by timeline attachments via:
SELECT * FROM `sys_objects_storage` WHERE `object` LIKE '%timeline%';Then for any relevant storage object, inspect its files table (commonly
sys_storage_filesorbx_storage_files, depending on UNA generation/customization). If your install usessys_storage_files, you can usually filter bystorage_idand confirm rows aren’t piling up without owners.If you paste the result of the
sys_objects_storagequery (just the rows that match timeline), I can give you the exact orphan-detection SQL for your schema.If you do need additional cleanup
Do not delete by filesystem path. Instead, delete orphans by calling the storage object’s delete method on file IDs. That ensures thumbnails/transcodings/variants and DB rows are handled correctly.
The “right” cleanup step is therefore:
- find orphaned file IDs (DB query)
- for each file ID, call the storage API delete for that storage object
I can provide a small, safe “orphan sweeper” routine once you confirm:
- the storage object name(s) for timeline, and
- the storage files table name in your install
What to paste next (and I’ll turn it into the exact
WHERE)- Output of:
SELECT `type`, `action`, COUNT(*) AS `cnt` FROM `bx_timeline_events` GROUP BY `type`, `action` ORDER BY `cnt` DESC;- Output of:
SELECT `object`, `engine`, `params` FROM `sys_objects_storage` WHERE `object` LIKE '%timeline%';With those two, I’ll:
- produce the precise
WHEREclause for your setup (including the safest exclusions), and - tell you whether your symlink/NFS case should be fully covered by service delete or whether an orphan sweep is advisable in your environment.
- The