-
Notifications
You must be signed in to change notification settings - Fork 0
Automated Test: embed-url-handling-post #327
Conversation
📝 WalkthroughWalkthroughThis pull request introduces a discussion embedding feature for Discourse. It adds gems for feed handling, creates a new EmbedController with authentication, defines TopicEmbed model for tracking embedded topics, implements background jobs for RSS polling and topic retrieval, adds client-side embedding JavaScript, and introduces configuration settings for embedding hosts and feed sources. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant EmbedController
participant TopicRetriever
participant TopicEmbed
participant Database
Client->>EmbedController: GET /embed/best?embed_url=...
EmbedController->>EmbedController: Validate embeddable_host & referer
EmbedController->>TopicEmbed: topic_id_for_embed(embed_url)
TopicEmbed->>Database: Query topic_embeds
Database-->>TopicEmbed: Return topic_id or nil
TopicEmbed-->>EmbedController: Return topic_id
alt Topic exists
EmbedController->>EmbedController: Create TopicView
EmbedController-->>Client: Render best.html.erb
else Topic not found
EmbedController->>TopicRetriever: Queue retrieve job
EmbedController-->>Client: Render loading.html.erb
TopicRetriever->>TopicRetriever: Validate host & throttle
TopicRetriever->>TopicEmbed: import_remote(user, url)
TopicEmbed->>TopicEmbed: Fetch via Readability
TopicEmbed->>TopicEmbed: Absolutize URLs
TopicEmbed->>Database: Create Topic & Post
Database-->>TopicEmbed: Return created post
end
sequenceDiagram
participant Scheduler
participant PollFeed
participant SimpleRSS
participant TopicEmbed
participant Database
Scheduler->>PollFeed: execute()
PollFeed->>PollFeed: Check feed_polling_enabled, feed_polling_url, embed_by_username
alt Settings valid
PollFeed->>PollFeed: Resolve embedding user by username
PollFeed->>SimpleRSS: Parse feed from feed_polling_url
SimpleRSS-->>PollFeed: Return feed items
loop For each feed item
PollFeed->>PollFeed: Extract URL, title, content
PollFeed->>PollFeed: Unescape HTML entities
PollFeed->>TopicEmbed: import(user, url, title, content)
TopicEmbed->>Database: Create/Update topic & post
Database-->>TopicEmbed: Confirm
end
else Settings missing
PollFeed->>PollFeed: Return early
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~65 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 15
🤖 Fix all issues with AI agents
In `@app/assets/javascripts/embed.js`:
- Around line 5-12: Guard against a missing container and ensure the discourse
URL has a trailing slash: check the result of
document.getElementById('discourse-comments') (variable comments) is non-null
before calling comments.appendChild(iframe) and bail or log if missing; also
normalize discourseUrl when building iframe.src (the variable used with
discourseEmbedUrl) so it always ends with '/' (or insert one when concatenating)
to avoid malformed URLs. Ensure references: comments, iframe, discourseUrl,
discourseEmbedUrl, document.getElementById, and appendChild.
In `@app/controllers/embed_controller.rb`:
- Around line 24-31: The ensure_embeddable method currently sets a non-standard
X-Frame-Options value and enforces a brittle Referer host check; remove the
response.headers['X-Frame-Options'] = "ALLOWALL" line (do not emit an invalid
X-Frame-Options value) and instead, if you need explicit framing rules, set a
proper Content-Security-Policy frame-ancestors header or omit frame headers
entirely; also make the referer validation (the URI(request.referer || '').host
comparison against SiteSetting.embeddable_host in ensure_embeddable) a soft
check—treat missing/invalid Referer as allowed (or log and continue) rather than
raising Discourse::InvalidAccess, since Referer is spoofable/omitted and should
not be a hard security boundary.
- Around line 8-20: The cache header discourse_expires_in 1.minute is applied
unconditionally in the best action, causing the loading view to be cached; move
the discourse_expires_in call into the branch where
TopicEmbed.topic_id_for_embed(embed_url) returns a topic_id (i.e., inside the if
topic_id block, after assigning `@topic_view`) so only the real topic response is
cached, or alternatively set a much shorter expiry when rendering 'loading' in
the else branch (where Jobs.enqueue(:retrieve_topic, ...) and render 'loading'
occur) to avoid caching the loading page for a full minute.
In `@app/jobs/scheduled/poll_feed.rb`:
- Around line 35-36: i.content may be nil causing .scrub to raise; update the
extraction before calling CGI.unescapeHTML and TopicEmbed.import to guard
against nil by selecting the first available body (prefer i.content, fallback to
i.description, then i.summary), call .to_s or ensure non-nil before .scrub, and
then pass the sanitized content variable into TopicEmbed.import(user, url,
i.title, content). Ensure the symbol references i.content, i.description,
i.summary, .scrub, CGI.unescapeHTML, and TopicEmbed.import are used so the
change is applied at the correct spot.
- Line 29: Replace the unsafe Kernel#open call in the SimpleRSS.parse call with
an explicit URI.open/OpenURI.open_uri call to avoid command injection; update
the line using SimpleRSS.parse open(SiteSetting.feed_polling_url) to use
URI.open(SiteSetting.feed_polling_url) (or OpenURI.open_uri) and add require
'open-uri' at the top of poll_feed.rb if not already present so the intent is
explicit and Brakeman warnings are resolved.
In `@app/models/topic_embed.rb`:
- Line 48: The code uses open(url) to fetch external content which allows
command execution and SSRF; in app/models/topic_embed.rb replace Kernel#open
with URI.open and add URL validation before fetching: parse the url, ensure an
allowed scheme (http or https), reject or whitelist hosts/IPs (prevent private
IPs/metadata addresses), and only then call URI.open to read the body for
Readability::Document.new (reference: Readability::Document and the local
variable url). Ensure exceptions from invalid URLs or blocked destinations are
handled and logged.
- Line 13: Replace the in-place mutation and raw HTML interpolation: stop using
the destructive operator on contents (change the `contents << ...` usage in
topic_embed.rb to build a new string instead, e.g. assign back with `contents =
contents + ...` or `contents += ...`) and HTML-escape the `url` before embedding
it into the anchor tag (use ERB::Util.html_escape or CGI.escapeHTML on `url` and
insert that escaped value into the generated string). Ensure the final string
you assign is the safe HTML you intend to store/return.
In `@app/views/embed/best.html.erb`:
- Line 6: The ERB template has an invalid Ruby post-condition clause "<%- end if
%>" which causes a SyntaxError; replace that token with a standard ERB block
terminator "<% end %>" to correctly close the preceding if block (i.e., change
"<%- end if %>" -> "<% end %>" in the template).
- Around line 15-18: The img tag currently interpolates
post.user.small_avatar_url directly into src which can break HTML if the URL
contains single quotes or malicious characters; replace the raw tag with Rails'
safe helper by using image_tag(post.user.small_avatar_url, alt:
post.user.username) or at minimum wrap the value with ERB escaping
(h(post.user.small_avatar_url)) so the src attribute is properly escaped; update
the view around the author block (where post.user and small_avatar_url are used)
to use image_tag or h(...) and include an alt attribute for accessibility.
- Line 19: The embed view renders unsanitized HTML because the raw_html cook
method (Post.rb:133) returns raw content and TopicEmbed.import (called from
poll_feed.rb:36) consumes RSS content without sanitization; fix by ensuring
external HTML is sanitized via Discourse's Sanitizer before it reaches
TopicEmbed.import or by removing/replacing the raw_html cook method so the
normal sanitization pipeline is used. Locate the raw_html method in Post.rb and
either delete/disable it or change callers (e.g., in poll_feed.rb and
TopicEmbed.import) to call Sanitizer.sanitize (or the appropriate
Discourse::Sanitization API) on the decoded RSS content before passing to
TopicEmbed.import/TopicEmbed.import_remote, and confirm post.cooked is produced
by the standard, sanitized cook pipeline rather than returned raw.
In `@app/views/embed/loading.html.erb`:
- Around line 6-11: The page currently does an unbounded reload via the IIFE
that calls setTimeout -> document.location.reload(); modify this logic to
implement a bounded retry counter (e.g., stored in sessionStorage under a key
like "embedReloadAttempts" or encoded as a query parameter) so the IIFE reads
the counter, increments it on each scheduled reload, and only calls
document.location.reload() while attempts < MAX_ATTEMPTS (then stop or show an
error). Ensure the same unique symbols are updated: the self-invoking function,
the setTimeout invocation, and the document.location.reload() call so the reload
only happens up to the configured limit and the counter is persisted across
reloads.
In `@app/views/layouts/embed.html.erb`:
- Around line 6-15: The template is interpolating request.referer directly into
the JS postMessage call which risks XSS and breaks when nil; update the
embed.html.erb code that calls parent.postMessage to JS-escape the referer using
Rails' escape_javascript (or the j helper) when building the target origin and
provide a sensible fallback when request.referer is nil (for example use "*" or
derive origin from the embeddable_host site setting) so parent.postMessage
receives a valid, safely-escaped origin string.
In `@db/migrate/20131217174004_create_topic_embeds.rb`:
- Line 3: The migration currently uses create_table :topic_embeds, force: true
which will drop an existing table silently; remove the force: true option from
the create_table call (so that create_table :topic_embeds do |t| ...) to prevent
destructive automatic drops and allow the migration to fail if the table already
exists; if you want safe idempotence instead, replace the force option by
guarding with a conditional check like table_exists?(:topic_embeds) or wrap the
creation in a conditional, updating the migration method that contains
create_table :topic_embeds to ensure it no longer uses force: true.
In `@db/migrate/20131223171005_create_top_topics.rb`:
- Line 3: The migration currently passes force: true to create_table :top_topics
which will silently drop and recreate the table (risking data loss); remove
force: true and instead guard creation (e.g., check table_exists?(:top_topics)
and only call create_table when it does not exist, or use create_table with
if_not_exists: true if your Rails version supports it), keeping the table
name/top_topics and the existing create_table block unchanged otherwise.
In `@lib/topic_retriever.rb`:
- Around line 27-30: Replace the non-atomic two-step lock (calling
$redis.setnx(retrieved_key, "1") then $redis.expire(retrieved_key, 60)) with a
single atomic Redis SET using the EX and NX options so the TTL is applied
atomically; locate the block around the retrieved_key check in
lib/topic_retriever.rb and change the logic to use $redis.set with EX:60 and NX
semantics (treat success the same as setnx returning true) so the key cannot be
left without a TTL if the process crashes.
🧹 Nitpick comments (19)
app/assets/stylesheets/embed.css.scss (1)
14-36: Extract the shared author-column width into an SCSS variable.The
92pxwidth appears in both.author(line 16) and.cookedmargin-left(line 31). If one changes without the other, the layout breaks. A shared variable eliminates this risk.♻️ Suggested refactor
+$author-column-width: 92px; + article.post { border-bottom: 1px solid `#ddd`; .post-date { float: right; color: `#aaa`; font-size: 12px; margin: 4px 4px 0 0; } .author { padding: 20px 0; - width: 92px; + width: $author-column-width; float: left; text-align: center; h3 { text-align: center; color: `#4a6b82`; font-size: 13px; margin: 0; } } .cooked { padding: 20px 0; - margin-left: 92px; + margin-left: $author-column-width; p { margin: 0 0 1em 0; } } }Gemfile (1)
209-211: Version pinning for new gems is optional and consistent with this Gemfile's pattern.
ruby-readability(0.7.3) andsimple-rss(2.1.0) are both stable, well-maintained gems with millions of downloads—not abandonment-ware. While pinning versions is a best practice, most gems in this Gemfile don't have version constraints, so leaving these unpinned is consistent with the project's existing approach.app/assets/javascripts/embed.js (1)
15-24: Origin check is loose — consider a stricter comparison.
discourseUrl.indexOf(e.origin)accepts any origin that is a substring ofdiscourseUrl. While browser-sete.originlimits exploitation, a strict comparison against the parsed origin ofdiscourseUrlwould be more robust.♻️ Suggested stricter origin check
function postMessageReceived(e) { if (!e) { return; } - if (discourseUrl.indexOf(e.origin) === -1) { return; } + var a = document.createElement('a'); + a.href = discourseUrl; + if (a.origin !== e.origin) { return; } if (e.data) {db/migrate/20131217174004_create_topic_embeds.rb (1)
4-5: Consider adding an index ontopic_idfor efficient lookups.Queries joining or filtering by
topic_id(e.g., finding the embed record for a given topic) will benefit from an index.post_idmay also warrant one depending on query patterns.♻️ Add index
add_index :topic_embeds, :embed_url, unique: true + add_index :topic_embeds, :topic_idconfig/routes.rb (1)
245-246: Route addition looks correct.The route maps cleanly to
embed#best. Note: Brakeman flags that Rails 3.2.12 with globbing routes is vulnerable to directory traversal and RCE (CVE-2014-0130). While this specific route doesn't use globbing, upgrading Rails to at least 3.2.18 is strongly recommended to mitigate this framework-level vulnerability.spec/jobs/poll_feed_spec.rb (2)
12-28: Negative tests may not isolate each condition independently.Each test stubs only one setting but leaves the others unstubbed (relying on their defaults). If
executeshort-circuits on the first falsy condition, then e.g. thefeed_polling_urltest (line 18–22) might never actually checkfeed_polling_urlbecausefeed_polling_enabled?already defaults tofalse. Consider stubbing the other two settings to their "happy path" values in each negative test so that only the condition under test causes the early return.Example for the `feed_polling_url` test
it "requires feed_polling_url" do + SiteSetting.stubs(:feed_polling_enabled?).returns(true) SiteSetting.stubs(:feed_polling_url).returns(nil) + SiteSetting.stubs(:embed_by_username).returns(embed_by_username) poller.expects(:poll_feed).never poller.execute({}) endApply the same pattern to the other two negative tests.
5-6: Extra empty lines flagged by RuboCop.Static analysis flagged extra empty lines at block body beginning (line 5) and end (line 39). Remove them to satisfy
Layout/EmptyLinesAroundBlockBody.Proposed fix
describe Jobs::PollFeed do - let(:poller) { Jobs::PollFeed.new }end end - endAlso applies to: 39-40
lib/post_creator.rb (1)
212-216:cook_methodcould be consolidated into the pass-through attributes list.Line 216 duplicates the exact pattern used in the loop on lines 212–214. You could simply add
:cook_methodto the array to reduce repetition.Proposed change
- [:post_type, :no_bump, :cooking_options, :image_sizes, :acting_user, :invalidate_oneboxes].each do |a| + [:post_type, :no_bump, :cooking_options, :image_sizes, :acting_user, :invalidate_oneboxes, :cook_method].each do |a| post.send("#{a}=", `@opts`[a]) if `@opts`[a].present? end - - post.cook_method = `@opts`[:cook_method] if `@opts`[:cook_method].present?spec/components/topic_retriever_spec.rb (2)
21-25: Invalid-URL test doesn't stubembeddable_host— may pass for the wrong reason.If
TopicRetriever#retrievechecksembeddable_hostbefore parsing the URL, this test passes because the host check fails (nil), not because the URL is invalid. Stubembeddable_hostto a value so the URL-validation path is actually exercised.Proposed fix
it "does not call perform_retrieve when the embed url is not a url" do + SiteSetting.expects(:embeddable_host).returns("eviltrout.com") r = TopicRetriever.new("not a url") r.expects(:perform_retrieve).never r.retrieve end
5-6: Extra empty lines flagged by RuboCop.Remove the extra blank lines at block body beginning (line 5) and end (lines 43, 45) per
Layout/EmptyLinesAroundBlockBody.Also applies to: 43-46
spec/controllers/embed_controller_spec.rb (2)
35-38: Usingafterfor assertions is an anti-pattern.Assertions in
afterblocks run even when the test body raises, producing confusing double-failures. Move these checks into eachitblock (or use a sharedafteronly for teardown).Proposed refactor — inline assertions
- after do - response.should be_success - response.headers['X-Frame-Options'].should == "ALLOWALL" - end - it "tells the topic retriever to work when no previous embed is found" do TopicEmbed.expects(:topic_id_for_embed).returns(nil) retriever = mock TopicRetriever.expects(:new).returns(retriever) retriever.expects(:retrieve) get :best, embed_url: embed_url + response.should be_success + response.headers['X-Frame-Options'].should == "ALLOWALL" end it "creates a topic view when a topic_id is found" do TopicEmbed.expects(:topic_id_for_embed).returns(123) TopicView.expects(:new).with(123, nil, {best: 5}) get :best, embed_url: embed_url + response.should be_success + response.headers['X-Frame-Options'].should == "ALLOWALL" end
4-5: Extra empty lines flagged by RuboCop.Lines 4, 30, 54, and 57 have extra blank lines per
Layout/EmptyLinesAroundBlockBody. Remove them to satisfy the convention.Also applies to: 54-58
spec/models/topic_embed_spec.rb (3)
34-34:start_with?assertion gives opaque failure messages.If this assertion fails, RSpec will only report "expected true, got false" without showing the actual
cookedvalue. Useshould start_with(...)orshould include(...)instead so failures display the actual vs. expected content.Proposed fix
- post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true + post.cooked.should start_with("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">")
3-48: Missing test coverage forimport_remote,topic_id_for_embed, andabsolutize_urls.The AI summary and other specs (
embed_controller_spec,topic_retriever_spec) referenceTopicEmbed.topic_id_for_embedandTopicEmbed.import_remote, but neither has a dedicated test here.absolutize_urlsis only tested indirectly. Consider adding direct unit tests for these class methods, especiallytopic_id_for_embedwhich is used in the controller.Would you like me to generate test stubs for these methods or open an issue to track this?
4-5: Extra empty lines flagged by RuboCop.Multiple
Layout/EmptyLinesAroundBlockBodyviolations at lines 4, 10, 12, 43, 45, and 47. Remove the extra blank lines.Also applies to: 10-12, 43-48
app/jobs/regular/retrieve_topic.rb (1)
1-1: Unusedrequire_dependency.
email/senderis never referenced in this file. This appears to be a copy-paste artifact.Proposed fix
-require_dependency 'email/sender' require_dependency 'topic_retriever'app/jobs/scheduled/poll_feed.rb (1)
20-22:feed_keyis defined but never called — dead code.This method computes a cache key but is not referenced anywhere in this file or, per the summary, elsewhere. If it's intended for future use (e.g., ETag/Last-Modified caching), consider removing it until then.
lib/topic_retriever.rb (1)
40-43: Synchronously executing the entirePollFeedjob is expensive and fragile.
Jobs::PollFeed.new.execute({})runs the full RSS poll inline—fetching and importing all feed items—just to check if one specificembed_urlhappens to be in the feed. Any error during the poll (network timeout, malformed feed) will propagate up and abort the single-topic retrieval. Consider either enqueuingPollFeedasynchronously or extracting a targeted lookup for the specific URL.app/models/topic_embed.rb (1)
1-1: Userequireinstead ofrequire_dependencyfor gems.
require_dependencyis a Rails autoloading mechanism for application code. Gems like Nokogiri should be loaded with plainrequire.Proposed fix
-require_dependency 'nokogiri' +require 'nokogiri'
| var comments = document.getElementById('discourse-comments'), | ||
| iframe = document.createElement('iframe'); | ||
| iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl); | ||
| iframe.id = 'discourse-embed-frame'; | ||
| iframe.width = "100%"; | ||
| iframe.frameBorder = "0"; | ||
| iframe.scrolling = "no"; | ||
| comments.appendChild(iframe); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential NPE if #discourse-comments element is missing.
If document.getElementById('discourse-comments') returns null, the appendChild call on Line 12 will throw a TypeError. Consider adding a guard.
Also, Line 7: if discourseUrl doesn't end with a trailing /, the constructed URL will be malformed (e.g., "https://example.comembed/best?...").
🐛 Proposed fix
var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
- iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
+ iframe.src = discourseUrl + (discourseUrl[discourseUrl.length-1] === '/' ? '' : '/') + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
- comments.appendChild(iframe);
+ if (comments) { comments.appendChild(iframe); }🤖 Prompt for AI Agents
In `@app/assets/javascripts/embed.js` around lines 5 - 12, Guard against a missing
container and ensure the discourse URL has a trailing slash: check the result of
document.getElementById('discourse-comments') (variable comments) is non-null
before calling comments.appendChild(iframe) and bail or log if missing; also
normalize discourseUrl when building iframe.src (the variable used with
discourseEmbedUrl) so it always ends with '/' (or insert one when concatenating)
to avoid malformed URLs. Ensure references: comments, iframe, discourseUrl,
discourseEmbedUrl, document.getElementById, and appendChild.
| def best | ||
| embed_url = params.require(:embed_url) | ||
| topic_id = TopicEmbed.topic_id_for_embed(embed_url) | ||
|
|
||
| if topic_id | ||
| @topic_view = TopicView.new(topic_id, current_user, {best: 5}) | ||
| else | ||
| Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url) | ||
| render 'loading' | ||
| end | ||
|
|
||
| discourse_expires_in 1.minute | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
best action: cache header is set even on the loading render path — verify this is intentional.
discourse_expires_in 1.minute on Line 19 runs unconditionally, including when the topic hasn't been fetched yet and the loading view is rendered. This means the "loading" response could be cached for up to 1 minute, delaying the user from seeing the actual content once the background job completes. Consider moving the cache directive inside the if topic_id branch, or using a much shorter expiry for the loading path.
Proposed fix
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
`@topic_view` = TopicView.new(topic_id, current_user, {best: 5})
+ discourse_expires_in 1.minute
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end
-
- discourse_expires_in 1.minute
end📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def best | |
| embed_url = params.require(:embed_url) | |
| topic_id = TopicEmbed.topic_id_for_embed(embed_url) | |
| if topic_id | |
| @topic_view = TopicView.new(topic_id, current_user, {best: 5}) | |
| else | |
| Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url) | |
| render 'loading' | |
| end | |
| discourse_expires_in 1.minute | |
| end | |
| def best | |
| embed_url = params.require(:embed_url) | |
| topic_id = TopicEmbed.topic_id_for_embed(embed_url) | |
| if topic_id | |
| `@topic_view` = TopicView.new(topic_id, current_user, {best: 5}) | |
| discourse_expires_in 1.minute | |
| else | |
| Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url) | |
| render 'loading' | |
| end | |
| end |
🤖 Prompt for AI Agents
In `@app/controllers/embed_controller.rb` around lines 8 - 20, The cache header
discourse_expires_in 1.minute is applied unconditionally in the best action,
causing the loading view to be cached; move the discourse_expires_in call into
the branch where TopicEmbed.topic_id_for_embed(embed_url) returns a topic_id
(i.e., inside the if topic_id block, after assigning `@topic_view`) so only the
real topic response is cached, or alternatively set a much shorter expiry when
rendering 'loading' in the else branch (where Jobs.enqueue(:retrieve_topic, ...)
and render 'loading' occur) to avoid caching the loading page for a full minute.
| def ensure_embeddable | ||
| raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank? | ||
| raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host | ||
|
|
||
| response.headers['X-Frame-Options'] = "ALLOWALL" | ||
| rescue URI::InvalidURIError | ||
| raise Discourse::InvalidAccess.new('invalid referer host') | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
X-Frame-Options: ALLOWALL is not a standard directive; referer check is easily bypassed.
Two observations on ensure_embeddable:
-
Line 28:
ALLOWALLis not a validX-Frame-Optionsvalue (onlyDENY,SAMEORIGIN, andALLOW-FROMare defined in RFC 7034). Browsers will ignore the unrecognized value, which effectively allows framing—so the intent works, but only by accident. Consider omitting the header entirely or usingContent-Security-Policy: frame-ancestors *for explicit semantics. -
Line 26: The
Referer-based host check is trivially spoofable and will also reject legitimate requests from browsers/proxies that strip theRefererheader (privacy settings, HTTPS→HTTP transitions). This is acceptable as a soft hotlink guard for public content, but should not be relied upon as a security boundary.
🤖 Prompt for AI Agents
In `@app/controllers/embed_controller.rb` around lines 24 - 31, The
ensure_embeddable method currently sets a non-standard X-Frame-Options value and
enforces a brittle Referer host check; remove the
response.headers['X-Frame-Options'] = "ALLOWALL" line (do not emit an invalid
X-Frame-Options value) and instead, if you need explicit framing rules, set a
proper Content-Security-Policy frame-ancestors header or omit frame headers
entirely; also make the referer validation (the URI(request.referer || '').host
comparison against SiteSetting.embeddable_host in ensure_embeddable) a soft
check—treat missing/invalid Referer as allowed (or log and continue) rather than
raising Discourse::InvalidAccess, since Referer is spoofable/omitted and should
not be a hard security boundary.
| return if user.blank? | ||
|
|
||
| require 'simple-rss' | ||
| rss = SimpleRSS.parse open(SiteSetting.feed_polling_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
open() on admin-supplied URL — use URI.open to avoid Kernel#open command injection.
Brakeman's flag is valid here. In Ruby versions before 3.0, Kernel#open interprets strings starting with | as shell commands. While SiteSetting.feed_polling_url is admin-controlled, using URI.open (or OpenURI.open_uri) makes the intent explicit and eliminates the risk entirely.
Proposed fix
- rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
+ rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rss = SimpleRSS.parse open(SiteSetting.feed_polling_url) | |
| rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url) |
🧰 Tools
🪛 Brakeman (8.0.1)
[medium] 29-29: Model attribute used in file name
Type: File Access
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/file_access/
(File Access)
🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` at line 29, Replace the unsafe Kernel#open
call in the SimpleRSS.parse call with an explicit URI.open/OpenURI.open_uri call
to avoid command injection; update the line using SimpleRSS.parse
open(SiteSetting.feed_polling_url) to use URI.open(SiteSetting.feed_polling_url)
(or OpenURI.open_uri) and add require 'open-uri' at the top of poll_feed.rb if
not already present so the intent is explicit and Brakeman warnings are
resolved.
| content = CGI.unescapeHTML(i.content.scrub) | ||
| TopicEmbed.import(user, url, i.title, content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.content may be nil for feed items without a <content> element, causing NoMethodError on .scrub.
Not all RSS/Atom items have a content field (some only have description or summary). When i.content is nil, calling .scrub on it will raise NoMethodError. Add a nil guard or fall back to i.description / i.summary.
Proposed fix
- content = CGI.unescapeHTML(i.content.scrub)
+ raw_content = i.content || i.description || i.summary || ''
+ content = CGI.unescapeHTML(raw_content.scrub)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| content = CGI.unescapeHTML(i.content.scrub) | |
| TopicEmbed.import(user, url, i.title, content) | |
| raw_content = i.content || i.description || i.summary || '' | |
| content = CGI.unescapeHTML(raw_content.scrub) | |
| TopicEmbed.import(user, url, i.title, content) |
🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` around lines 35 - 36, i.content may be nil
causing .scrub to raise; update the extraction before calling CGI.unescapeHTML
and TopicEmbed.import to guard against nil by selecting the first available body
(prefer i.content, fallback to i.description, then i.summary), call .to_s or
ensure non-nil before .scrub, and then pass the sanitized content variable into
TopicEmbed.import(user, url, i.title, content). Ensure the symbol references
i.content, i.description, i.summary, .scrub, CGI.unescapeHTML, and
TopicEmbed.import are used so the change is applied at the correct spot.
| <script> | ||
| (function() { | ||
| setTimeout(function() { | ||
| document.location.reload(); | ||
| }, 30000); | ||
| })(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unbounded auto-reload loop — no retry limit.
If the embedded content never becomes available, this page will reload every 30 seconds indefinitely, wasting bandwidth and server resources. Consider adding a retry counter (e.g., via a query parameter or sessionStorage) and stopping after a reasonable number of attempts.
💡 Sketch of a bounded retry
<script>
(function() {
+ var maxRetries = 10;
+ var retryCount = parseInt(sessionStorage.getItem('embed_retry') || '0', 10);
+ if (retryCount >= maxRetries) {
+ sessionStorage.removeItem('embed_retry');
+ return;
+ }
+ sessionStorage.setItem('embed_retry', retryCount + 1);
setTimeout(function() {
document.location.reload();
}, 30000);
})();
</script>📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <script> | |
| (function() { | |
| setTimeout(function() { | |
| document.location.reload(); | |
| }, 30000); | |
| })(); | |
| <script> | |
| (function() { | |
| var maxRetries = 10; | |
| var retryCount = parseInt(sessionStorage.getItem('embed_retry') || '0', 10); | |
| if (retryCount >= maxRetries) { | |
| sessionStorage.removeItem('embed_retry'); | |
| return; | |
| } | |
| sessionStorage.setItem('embed_retry', retryCount + 1); | |
| setTimeout(function() { | |
| document.location.reload(); | |
| }, 30000); | |
| })(); | |
| </script> |
🤖 Prompt for AI Agents
In `@app/views/embed/loading.html.erb` around lines 6 - 11, The page currently
does an unbounded reload via the IIFE that calls setTimeout ->
document.location.reload(); modify this logic to implement a bounded retry
counter (e.g., stored in sessionStorage under a key like "embedReloadAttempts"
or encoded as a query parameter) so the IIFE reads the counter, increments it on
each scheduled reload, and only calls document.location.reload() while attempts
< MAX_ATTEMPTS (then stop or show an error). Ensure the same unique symbols are
updated: the self-invoking function, the setTimeout invocation, and the
document.location.reload() call so the reload only happens up to the configured
limit and the counter is persisted across reloads.
| <script> | ||
| (function() { | ||
| window.onload = function() { | ||
| if (parent) { | ||
| // Send a post message with our loaded height | ||
| parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, '<%= request.referer %>'); | ||
| } | ||
| } | ||
| })(); | ||
| </script> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
request.referer injected into JS context — potential XSS and nil risk.
request.referer is user-controlled and is being interpolated directly into a JavaScript string literal. While Rails' <%= %> HTML-escapes the output, HTML escaping is not the correct encoding for JavaScript string contexts. Use escape_javascript (or the j helper) to properly escape for JS.
Additionally, if request.referer is nil (Referer header can be absent), the output will be an empty string, causing postMessage to fail.
🐛 Proposed fix
- parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, '<%= request.referer %>');
+ var ref = '<%= j(request.referer || "*") %>';
+ parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, ref);Note: Using "*" as the fallback target origin allows any origin to receive the message. If this is too permissive, consider deriving the origin from the embeddable_host site setting instead.
🤖 Prompt for AI Agents
In `@app/views/layouts/embed.html.erb` around lines 6 - 15, The template is
interpolating request.referer directly into the JS postMessage call which risks
XSS and breaks when nil; update the embed.html.erb code that calls
parent.postMessage to JS-escape the referer using Rails' escape_javascript (or
the j helper) when building the target origin and provide a sensible fallback
when request.referer is nil (for example use "*" or derive origin from the
embeddable_host site setting) so parent.postMessage receives a valid,
safely-escaped origin string.
| @@ -0,0 +1,13 @@ | |||
| class CreateTopicEmbeds < ActiveRecord::Migration | |||
| def change | |||
| create_table :topic_embeds, force: true do |t| | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force: true will silently drop the table if it already exists — risky for production.
If this migration is ever re-run or the table exists from a prior partial deploy, force: true destroys all existing data without warning. Remove it to let the migration fail safely if the table already exists.
🐛 Proposed fix
- create_table :topic_embeds, force: true do |t|
+ create_table :topic_embeds do |t|📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| create_table :topic_embeds, force: true do |t| | |
| create_table :topic_embeds do |t| |
🤖 Prompt for AI Agents
In `@db/migrate/20131217174004_create_topic_embeds.rb` at line 3, The migration
currently uses create_table :topic_embeds, force: true which will drop an
existing table silently; remove the force: true option from the create_table
call (so that create_table :topic_embeds do |t| ...) to prevent destructive
automatic drops and allow the migration to fail if the table already exists; if
you want safe idempotence instead, replace the force option by guarding with a
conditional check like table_exists?(:topic_embeds) or wrap the creation in a
conditional, updating the migration method that contains create_table
:topic_embeds to ensure it no longer uses force: true.
| @@ -1,6 +1,6 @@ | |||
| class CreateTopTopics < ActiveRecord::Migration | |||
| def change | |||
| create_table :top_topics do |t| | |||
| create_table :top_topics, force: true do |t| | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force: true silently drops the table if it already exists — risk of data loss.
Adding force: true to create_table in a migration means re-running this migration (e.g., db:migrate:redo) will drop and recreate top_topics, destroying all rows. While top_topics is a derived/cache table and can be recomputed, this is a dangerous pattern in migrations. If this was added to work around an "already exists" error, prefer a guard like create_table ... unless table_exists?(:top_topics) or fix the root cause.
🤖 Prompt for AI Agents
In `@db/migrate/20131223171005_create_top_topics.rb` at line 3, The migration
currently passes force: true to create_table :top_topics which will silently
drop and recreate the table (risking data loss); remove force: true and instead
guard creation (e.g., check table_exists?(:top_topics) and only call
create_table when it does not exist, or use create_table with if_not_exists:
true if your Rails version supports it), keeping the table name/top_topics and
the existing create_table block unchanged otherwise.
| if $redis.setnx(retrieved_key, "1") | ||
| $redis.expire(retrieved_key, 60) | ||
| return false | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition: SETNX + EXPIRE is not atomic — key can persist forever if the process crashes between the two calls.
If the process dies after setnx succeeds but before expire runs, the key will have no TTL and the URL will be permanently throttled. Use a single atomic SET with EX and NX flags.
Proposed fix
- if $redis.setnx(retrieved_key, "1")
- $redis.expire(retrieved_key, 60)
- return false
- end
+ return false if $redis.set(retrieved_key, "1", ex: 60, nx: true)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if $redis.setnx(retrieved_key, "1") | |
| $redis.expire(retrieved_key, 60) | |
| return false | |
| end | |
| return false if $redis.set(retrieved_key, "1", ex: 60, nx: true) |
🤖 Prompt for AI Agents
In `@lib/topic_retriever.rb` around lines 27 - 30, Replace the non-atomic two-step
lock (calling $redis.setnx(retrieved_key, "1") then $redis.expire(retrieved_key,
60)) with a single atomic Redis SET using the EX and NX options so the TTL is
applied atomically; locate the block around the retrieved_key check in
lib/topic_retriever.rb and change the logic to use $redis.set with EX:60 and NX
semantics (treat success the same as setnx returning true) so the key cannot be
left without a TTL if the process crashes.
This pull request was automatically created by
@coderabbitai/e2e-reviewer.Batch created pull request.
Summary by CodeRabbit