Skip to content

Automated Test: embed-url-handling-post #327

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions Gemfile
Original file line number Diff line number Diff line change
Expand Up @@ -206,6 +206,10 @@ gem 'unicorn', require: false
gem 'puma', require: false
gem 'rbtrace', require: false

# required for feed importing and embedding
gem 'ruby-readability', require: false
gem 'simple-rss', require: false

# perftools only works on 1.9 atm
group :profile do
# travis refuses to install this, instead of fuffing, just avoid it for now
Expand Down
7 changes: 7 additions & 0 deletions Gemfile_rails4.lock
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,7 @@ GEM
fspath (2.0.5)
given_core (3.1.1)
sorcerer (>= 0.3.7)
guess_html_encoding (0.0.9)
handlebars-source (1.1.2)
hashie (2.0.5)
highline (1.6.20)
Expand Down Expand Up @@ -309,6 +310,9 @@ GEM
rspec-mocks (~> 2.14.0)
ruby-hmac (0.4.0)
ruby-openid (2.3.0)
ruby-readability (0.5.7)
guess_html_encoding (>= 0.0.4)
nokogiri (>= 1.4.2)
sanitize (2.0.6)
nokogiri (>= 1.4.4)
sass (3.2.12)
Expand Down Expand Up @@ -337,6 +341,7 @@ GEM
celluloid (>= 0.14.1)
ice_cube (~> 0.11.0)
sidekiq (~> 2.15.0)
simple-rss (1.3.1)
simplecov (0.7.1)
multi_json (~> 1.0)
simplecov-html (~> 0.7.1)
Expand Down Expand Up @@ -466,6 +471,7 @@ DEPENDENCIES
rinku
rspec-given
rspec-rails
ruby-readability
sanitize
sass
sass-rails
Expand All @@ -474,6 +480,7 @@ DEPENDENCIES
sidekiq (= 2.15.1)
sidekiq-failures
sidetiq (>= 0.3.6)
simple-rss
simplecov
sinatra
slim
Expand Down
27 changes: 27 additions & 0 deletions app/assets/javascripts/embed.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
/* global discourseUrl */
/* global discourseEmbedUrl */
(function() {

var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
comments.appendChild(iframe);
Comment on lines +5 to +12

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential NPE if #discourse-comments element is missing.

If document.getElementById('discourse-comments') returns null, the appendChild call on Line 12 will throw a TypeError. Consider adding a guard.

Also, Line 7: if discourseUrl doesn't end with a trailing /, the constructed URL will be malformed (e.g., "https://example.comembed/best?...").

🐛 Proposed fix
  var comments = document.getElementById('discourse-comments'),
      iframe = document.createElement('iframe');
- iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
+ iframe.src = discourseUrl + (discourseUrl[discourseUrl.length-1] === '/' ? '' : '/') + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
  iframe.id = 'discourse-embed-frame';
  iframe.width = "100%";
  iframe.frameBorder = "0";
  iframe.scrolling = "no";
- comments.appendChild(iframe);
+ if (comments) { comments.appendChild(iframe); }
🤖 Prompt for AI Agents
In `@app/assets/javascripts/embed.js` around lines 5 - 12, Guard against a missing
container and ensure the discourse URL has a trailing slash: check the result of
document.getElementById('discourse-comments') (variable comments) is non-null
before calling comments.appendChild(iframe) and bail or log if missing; also
normalize discourseUrl when building iframe.src (the variable used with
discourseEmbedUrl) so it always ends with '/' (or insert one when concatenating)
to avoid malformed URLs. Ensure references: comments, iframe, discourseUrl,
discourseEmbedUrl, document.getElementById, and appendChild.



function postMessageReceived(e) {
if (!e) { return; }
if (discourseUrl.indexOf(e.origin) === -1) { return; }

if (e.data) {
if (e.data.type === 'discourse-resize' && e.data.height) {
iframe.height = e.data.height + "px";
}
}
}
window.addEventListener('message', postMessageReceived, false);

})();
69 changes: 69 additions & 0 deletions app/assets/stylesheets/embed.css.scss
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
//= require ./vendor/normalize
//= require ./common/foundation/base

article.post {
border-bottom: 1px solid #ddd;

.post-date {
float: right;
color: #aaa;
font-size: 12px;
margin: 4px 4px 0 0;
}

.author {
padding: 20px 0;
width: 92px;
float: left;

text-align: center;

h3 {
text-align: center;
color: #4a6b82;
font-size: 13px;
margin: 0;
}
}

.cooked {
padding: 20px 0;
margin-left: 92px;

p {
margin: 0 0 1em 0;
}
}
}

header {
padding: 10px 10px 20px 10px;

font-size: 18px;

border-bottom: 1px solid #ddd;
}

footer {
font-size: 18px;

.logo {
margin-right: 10px;
margin-top: 10px;
}

a[href].button {
margin: 10px 0 0 10px;
}
}

.logo {
float: right;
max-height: 30px;
}

a[href].button {
background-color: #eee;
padding: 5px;
display: inline-block;
}
34 changes: 34 additions & 0 deletions app/controllers/embed_controller.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
class EmbedController < ApplicationController
skip_before_filter :check_xhr
skip_before_filter :preload_json
before_filter :ensure_embeddable

layout 'embed'

def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)

if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end

discourse_expires_in 1.minute
end
Comment on lines +8 to +20

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

best action: cache header is set even on the loading render path — verify this is intentional.

discourse_expires_in 1.minute on Line 19 runs unconditionally, including when the topic hasn't been fetched yet and the loading view is rendered. This means the "loading" response could be cached for up to 1 minute, delaying the user from seeing the actual content once the background job completes. Consider moving the cache directive inside the if topic_id branch, or using a much shorter expiry for the loading path.

Proposed fix
   def best
     embed_url = params.require(:embed_url)
     topic_id = TopicEmbed.topic_id_for_embed(embed_url)
 
     if topic_id
       `@topic_view` = TopicView.new(topic_id, current_user, {best: 5})
+      discourse_expires_in 1.minute
     else
       Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
       render 'loading'
     end
-
-    discourse_expires_in 1.minute
   end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end
discourse_expires_in 1.minute
end
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
`@topic_view` = TopicView.new(topic_id, current_user, {best: 5})
discourse_expires_in 1.minute
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end
end
🤖 Prompt for AI Agents
In `@app/controllers/embed_controller.rb` around lines 8 - 20, The cache header
discourse_expires_in 1.minute is applied unconditionally in the best action,
causing the loading view to be cached; move the discourse_expires_in call into
the branch where TopicEmbed.topic_id_for_embed(embed_url) returns a topic_id
(i.e., inside the if topic_id block, after assigning `@topic_view`) so only the
real topic response is cached, or alternatively set a much shorter expiry when
rendering 'loading' in the else branch (where Jobs.enqueue(:retrieve_topic, ...)
and render 'loading' occur) to avoid caching the loading page for a full minute.


private

def ensure_embeddable
raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host

response.headers['X-Frame-Options'] = "ALLOWALL"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer host')
end
Comment on lines +24 to +31

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

X-Frame-Options: ALLOWALL is not a standard directive; referer check is easily bypassed.

Two observations on ensure_embeddable:

  1. Line 28: ALLOWALL is not a valid X-Frame-Options value (only DENY, SAMEORIGIN, and ALLOW-FROM are defined in RFC 7034). Browsers will ignore the unrecognized value, which effectively allows framing—so the intent works, but only by accident. Consider omitting the header entirely or using Content-Security-Policy: frame-ancestors * for explicit semantics.

  2. Line 26: The Referer-based host check is trivially spoofable and will also reject legitimate requests from browsers/proxies that strip the Referer header (privacy settings, HTTPS→HTTP transitions). This is acceptable as a soft hotlink guard for public content, but should not be relied upon as a security boundary.

🤖 Prompt for AI Agents
In `@app/controllers/embed_controller.rb` around lines 24 - 31, The
ensure_embeddable method currently sets a non-standard X-Frame-Options value and
enforces a brittle Referer host check; remove the
response.headers['X-Frame-Options'] = "ALLOWALL" line (do not emit an invalid
X-Frame-Options value) and instead, if you need explicit framing rules, set a
proper Content-Security-Policy frame-ancestors header or omit frame headers
entirely; also make the referer validation (the URI(request.referer || '').host
comparison against SiteSetting.embeddable_host in ensure_embeddable) a soft
check—treat missing/invalid Referer as allowed (or log and continue) rather than
raising Discourse::InvalidAccess, since Referer is spoofable/omitted and should
not be a hard security boundary.



end
24 changes: 24 additions & 0 deletions app/jobs/regular/retrieve_topic.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
require_dependency 'email/sender'
require_dependency 'topic_retriever'

module Jobs

# Asynchronously retrieve a topic from an embedded site
class RetrieveTopic < Jobs::Base

def execute(args)
raise Discourse::InvalidParameters.new(:embed_url) unless args[:embed_url].present?

user = nil
if args[:user_id]
user = User.where(id: args[:user_id]).first
end

TopicRetriever.new(args[:embed_url], no_throttle: user.try(:staff?)).retrieve
end

end

end


41 changes: 41 additions & 0 deletions app/jobs/scheduled/poll_feed.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
#
# Creates and Updates Topics based on an RSS or ATOM feed.
#
require 'digest/sha1'
require_dependency 'post_creator'
require_dependency 'post_revisor'
require 'open-uri'

module Jobs
class PollFeed < Jobs::Scheduled
recurrence { hourly }
sidekiq_options retry: false

def execute(args)
poll_feed if SiteSetting.feed_polling_enabled? &&
SiteSetting.feed_polling_url.present? &&
SiteSetting.embed_by_username.present?
end

def feed_key
@feed_key ||= "feed-modified:#{Digest::SHA1.hexdigest(SiteSetting.feed_polling_url)}"
end

def poll_feed
user = User.where(username_lower: SiteSetting.embed_by_username.downcase).first
return if user.blank?

require 'simple-rss'
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

open() on admin-supplied URL — use URI.open to avoid Kernel#open command injection.

Brakeman's flag is valid here. In Ruby versions before 3.0, Kernel#open interprets strings starting with | as shell commands. While SiteSetting.feed_polling_url is admin-controlled, using URI.open (or OpenURI.open_uri) makes the intent explicit and eliminates the risk entirely.

Proposed fix
-      rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
+      rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url)
🧰 Tools
🪛 Brakeman (8.0.1)

[medium] 29-29: Model attribute used in file name
Type: File Access
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/file_access/

(File Access)

🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` at line 29, Replace the unsafe Kernel#open
call in the SimpleRSS.parse call with an explicit URI.open/OpenURI.open_uri call
to avoid command injection; update the line using SimpleRSS.parse
open(SiteSetting.feed_polling_url) to use URI.open(SiteSetting.feed_polling_url)
(or OpenURI.open_uri) and add require 'open-uri' at the top of poll_feed.rb if
not already present so the intent is explicit and Brakeman warnings are
resolved.


rss.items.each do |i|
url = i.link
url = i.id if url.blank? || url !~ /^https?\:\/\//

content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
Comment on lines +35 to +36

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

i.content may be nil for feed items without a <content> element, causing NoMethodError on .scrub.

Not all RSS/Atom items have a content field (some only have description or summary). When i.content is nil, calling .scrub on it will raise NoMethodError. Add a nil guard or fall back to i.description / i.summary.

Proposed fix
-        content = CGI.unescapeHTML(i.content.scrub)
+        raw_content = i.content || i.description || i.summary || ''
+        content = CGI.unescapeHTML(raw_content.scrub)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
raw_content = i.content || i.description || i.summary || ''
content = CGI.unescapeHTML(raw_content.scrub)
TopicEmbed.import(user, url, i.title, content)
🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` around lines 35 - 36, i.content may be nil
causing .scrub to raise; update the extraction before calling CGI.unescapeHTML
and TopicEmbed.import to guard against nil by selecting the first available body
(prefer i.content, fallback to i.description, then i.summary), call .to_s or
ensure non-nil before .scrub, and then pass the sanitized content variable into
TopicEmbed.import(user, url, i.title, content). Ensure the symbol references
i.content, i.description, i.summary, .scrub, CGI.unescapeHTML, and
TopicEmbed.import are used so the change is applied at the correct spot.

end
end

end
end
9 changes: 9 additions & 0 deletions app/models/post.rb
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,10 @@ def self.types
@types ||= Enum.new(:regular, :moderator_action)
end

def self.cook_methods
@cook_methods ||= Enum.new(:regular, :raw_html)
end

def self.find_by_detail(key, value)
includes(:post_details).where(post_details: { key: key, value: value }).first
end
Expand Down Expand Up @@ -124,6 +128,11 @@ def post_analyzer
end

def cook(*args)
# For some posts, for example those imported via RSS, we support raw HTML. In that
# case we can skip the rendering pipeline.
return raw if cook_method == Post.cook_methods[:raw_html]

# Default is to cook posts
Plugin::Filter.apply(:after_post_cook, self, post_analyzer.cook(*args))
end

Expand Down
82 changes: 82 additions & 0 deletions app/models/topic_embed.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
require_dependency 'nokogiri'

class TopicEmbed < ActiveRecord::Base
belongs_to :topic
belongs_to :post
validates_presence_of :embed_url
validates_presence_of :content_sha1

# Import an article from a source (RSS/Atom/Other)
def self.import(user, url, title, contents)
return unless url =~ /^https?\:\/\//

contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Mutates the caller's contents string in-place and interpolates an unescaped URL into HTML.

Two issues on this line:

  1. contents << ... uses in-place mutation (<<), so the caller's original string is permanently modified as a side effect. Use + or string interpolation to create a new string instead.
  2. url is interpolated directly into an <a> tag without HTML escaping. If the URL contains ', <, or >, it can break the markup or introduce XSS.
Proposed fix
-    contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
+    escaped_url = ERB::Util.html_escape(url)
+    contents = contents + "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{escaped_url}'>#{escaped_url}</a>")}</small>\n"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
escaped_url = ERB::Util.html_escape(url)
contents = contents + "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{escaped_url}'>#{escaped_url}</a>")}</small>\n"
🤖 Prompt for AI Agents
In `@app/models/topic_embed.rb` at line 13, Replace the in-place mutation and raw
HTML interpolation: stop using the destructive operator on contents (change the
`contents << ...` usage in topic_embed.rb to build a new string instead, e.g.
assign back with `contents = contents + ...` or `contents += ...`) and
HTML-escape the `url` before embedding it into the anchor tag (use
ERB::Util.html_escape or CGI.escapeHTML on `url` and insert that escaped value
into the generated string). Ensure the final string you assign is the safe HTML
you intend to store/return.


embed = TopicEmbed.where(embed_url: url).first
content_sha1 = Digest::SHA1.hexdigest(contents)
post = nil

# If there is no embed, create a topic, post and the embed.
if embed.blank?
Topic.transaction do
creator = PostCreator.new(user, title: title, raw: absolutize_urls(url, contents), skip_validations: true, cook_method: Post.cook_methods[:raw_html])
post = creator.create
if post.present?
TopicEmbed.create!(topic_id: post.topic_id,
embed_url: url,
content_sha1: content_sha1,
post_id: post.id)
end
end
else
post = embed.post
# Update the topic if it changed
if content_sha1 != embed.content_sha1
revisor = PostRevisor.new(post)
revisor.revise!(user, absolutize_urls(url, contents), skip_validations: true, bypass_rate_limiter: true)
embed.update_column(:content_sha1, content_sha1)
end
end

post
end

def self.import_remote(user, url, opts=nil)
require 'ruby-readability'

opts = opts || {}
doc = Readability::Document.new(open(url).read,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

open(url) with an external URL — SSRF and command injection risk.

Unlike the feed polling URL (admin-controlled), this url originates from embed requests and is more attacker-accessible. Kernel#open in Ruby < 3.0 can execute shell commands via pipe (|command). Even without that, fetching arbitrary URLs is an SSRF vector (internal network scanning, cloud metadata endpoints, etc.).

Use URI.open at minimum, and consider validating/restricting the URL scheme and destination.

Proposed fix
-    doc = Readability::Document.new(open(url).read,
+    doc = Readability::Document.new(URI.open(url).read,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
doc = Readability::Document.new(open(url).read,
doc = Readability::Document.new(URI.open(url).read,
🤖 Prompt for AI Agents
In `@app/models/topic_embed.rb` at line 48, The code uses open(url) to fetch
external content which allows command execution and SSRF; in
app/models/topic_embed.rb replace Kernel#open with URI.open and add URL
validation before fetching: parse the url, ensure an allowed scheme (http or
https), reject or whitelist hosts/IPs (prevent private IPs/metadata addresses),
and only then call URI.open to read the body for Readability::Document.new
(reference: Readability::Document and the local variable url). Ensure exceptions
from invalid URLs or blocked destinations are handled and logged.

tags: %w[div p code pre h1 h2 h3 b em i strong a img],
attributes: %w[href src])

TopicEmbed.import(user, url, opts[:title] || doc.title, doc.content)
end

# Convert any relative URLs to absolute. RSS is annoying for this.
def self.absolutize_urls(url, contents)
uri = URI(url)
prefix = "#{uri.scheme}://#{uri.host}"
prefix << ":#{uri.port}" if uri.port != 80 && uri.port != 443

fragment = Nokogiri::HTML.fragment(contents)
fragment.css('a').each do |a|
href = a['href']
if href.present? && href.start_with?('/')
a['href'] = "#{prefix}/#{href.sub(/^\/+/, '')}"
end
end
fragment.css('img').each do |a|
src = a['src']
if src.present? && src.start_with?('/')
a['src'] = "#{prefix}/#{src.sub(/^\/+/, '')}"
end
end

fragment.to_html
end

def self.topic_id_for_embed(embed_url)
TopicEmbed.where(embed_url: embed_url).pluck(:topic_id).first
end

end
30 changes: 30 additions & 0 deletions app/views/embed/best.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
<header>
<%- if @topic_view.posts.present? %>
<%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- else %>
<%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- end if %>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Syntax error: end if without a condition is invalid Ruby.

<%- end if %> is incomplete — Ruby's post-condition modifier end if <expr> requires an expression after if. This will raise a SyntaxError at template compilation time. It should simply be <% end %> to close the if block opened on Line 2.

🐛 Proposed fix
-  <%- end if %>
+  <%- end %>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<%- end if %>
<%- end %>
🤖 Prompt for AI Agents
In `@app/views/embed/best.html.erb` at line 6, The ERB template has an invalid
Ruby post-condition clause "<%- end if %>" which causes a SyntaxError; replace
that token with a standard ERB block terminator "<% end %>" to correctly close
the preceding if block (i.e., change "<%- end if %>" -> "<% end %>" in the
template).


<%= link_to(image_tag(SiteSetting.logo_url, class: 'logo'), Discourse.base_url) %>
</header>

<%- if @topic_view.posts.present? %>
<%- @topic_view.posts.each do |post| %>
<article class='post'>
<%= link_to post.created_at.strftime("%e %b %Y"), post.url, class: 'post-date', target: "_blank" %>
<div class='author'>
<img src='<%= post.user.small_avatar_url %>'>
<h3><%= post.user.username %></h3>
</div>
Comment on lines +15 to +18

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unescaped URL interpolation in img tag risks attribute breakout.

If small_avatar_url ever contains a single quote or other special characters, the raw string interpolation inside the src='...' attribute will break the HTML or open an injection vector. Prefer image_tag or at minimum escape the value.

🛡️ Proposed fix
-        <img src='<%= post.user.small_avatar_url %>'>
+        <%= image_tag(post.user.small_avatar_url) %>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<div class='author'>
<img src='<%= post.user.small_avatar_url %>'>
<h3><%= post.user.username %></h3>
</div>
<div class='author'>
<%= image_tag(post.user.small_avatar_url) %>
<h3><%= post.user.username %></h3>
</div>
🤖 Prompt for AI Agents
In `@app/views/embed/best.html.erb` around lines 15 - 18, The img tag currently
interpolates post.user.small_avatar_url directly into src which can break HTML
if the URL contains single quotes or malicious characters; replace the raw tag
with Rails' safe helper by using image_tag(post.user.small_avatar_url, alt:
post.user.username) or at minimum wrap the value with ERB escaping
(h(post.user.small_avatar_url)) so the src attribute is properly escaped; update
the view around the author block (where post.user and small_avatar_url are used)
to use image_tag or h(...) and include an alt attribute for accessibility.

<div class='cooked'><%= raw post.cooked %></div>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n 'raw_html' --type ruby -C 3

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 1990


🏁 Script executed:

rg -n 'absolutize_urls' --type ruby -B 2 -A 5

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 1777


🏁 Script executed:

rg -n 'TopicEmbed\.import' --type ruby -B 3 -A 10

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 4045


🏁 Script executed:

# Check for any sanitization in the TopicEmbed or embed context
rg -n 'sanitize|Sanitize' --type ruby app/models/topic_embed.rb -C 3

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 70


🏁 Script executed:

rg -n 'import_remote' --type ruby -B 2 -A 15

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 1499


🏁 Script executed:

# Check where TopicEmbed.import is called and what kind of content is passed
rg -n 'TopicEmbed\.import' --type ruby -B 5 -A 2 | head -50

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 3092


Raw HTML rendering in embed view bypasses Discourse's sanitization pipeline — this is a security vulnerability, not a safe pattern.

The raw_html cook method (Post.rb:133) explicitly returns raw untouched, skipping Discourse's sanitization pipeline entirely. post.cooked becomes identical to post.raw in this case. While TopicEmbed.import_remote applies a Readability tag/attribute whitelist, TopicEmbed.import itself has no sanitization. The vulnerable path is in poll_feed.rb:36, which calls TopicEmbed.import with RSS content that only receives CGI.unescapeHTML (entity decoding, not HTML sanitization). This unsanitized content flows through raw_html cook method directly into post.cooked, then renders in the embed view without escaping via the raw helper.

Ensure HTML sanitization via Discourse's Sanitizer (not just entity decoding) for all external content before calling TopicEmbed.import, or remove the raw_html cook method in favor of the standard sanitizing pipeline.

🤖 Prompt for AI Agents
In `@app/views/embed/best.html.erb` at line 19, The embed view renders unsanitized
HTML because the raw_html cook method (Post.rb:133) returns raw content and
TopicEmbed.import (called from poll_feed.rb:36) consumes RSS content without
sanitization; fix by ensuring external HTML is sanitized via Discourse's
Sanitizer before it reaches TopicEmbed.import or by removing/replacing the
raw_html cook method so the normal sanitization pipeline is used. Locate the
raw_html method in Post.rb and either delete/disable it or change callers (e.g.,
in poll_feed.rb and TopicEmbed.import) to call Sanitizer.sanitize (or the
appropriate Discourse::Sanitization API) on the decoded RSS content before
passing to TopicEmbed.import/TopicEmbed.import_remote, and confirm post.cooked
is produced by the standard, sanitized cook pipeline rather than returned raw.

<div style='clear: both'></div>
</article>
<%- end %>

<footer>
<%= link_to(I18n.t('embed.continue'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%= link_to(image_tag(SiteSetting.logo_url, class: 'logo'), Discourse.base_url) %>
</footer>

<% end %>

Loading